Errors & Resolutions
All operators should try to restart their nodes and should check if they are on the latest stable version before attempting anything other configuration change as we are still in beta and frequently releasing fixes. You can restart and update with the following commands:
docker compose down
git pull
docker compose up
You can check your logs using
docker compose logs
ENRs & Keys
What is an ENR?
An ENR is shorthand for an Ethereum Node Record . It is a way to represent a node on a public network, with a reliable mechanism to update its information.
At Obol we use ENRs to identify charon nodes to one another such that they
can form clusters with the right charon nodes and not impostors. ENRs have
private keys they use to sign updates to the
data contained in their ENR. This
private key is by default found at
.charon/charon-enr-private-key
, and should be kept secure, and
not checked into version control.
An ENR looks something like this:
enr:-JG4QAgAOXjGFcTIkXBO30aUMzg2YSo1CYV0OH8Sf2s7zA2kFjVC9ZQ_jZZItdE8gA-tUXW-rWGDqEcoQkeJ98Pw7GaGAYFI7eoegmlkgnY0gmlwhCKNyGGJc2VjcDI1NmsxoQI6SQlzw3WGZ_VxFHLhawQFhCK8Aw7Z0zq8IABksuJEJIN0Y3CCPoODdWRwgj6E
How do I get my ENR if I want to generate it again?
cd
to the directory where your private keys are located (ex:cd /path/to/charon/enr/private/key
)Run
docker run --rm -v "$(pwd):/opt/charon" obolnetwork/charon:latest enr
. This prints the ENR on your screen.
Please note that this ENR is not the same as the one generated when you created it for the first time. This is because the process of generating ENRs includes the current timestamp.
What do I do if lose my charon-enr-private-key
?
charon-enr-private-key
? -
For now, ENR rotation/replacement is not supported, it will be supported in a future release.
-
Therefore, it's advised to always keep a backup of your
private-key
in a secure location (ex: cloud storage, USB Flash drive etc.)
I can't find the keys anywhere
The
charon-enr-private-key
is generated inside a hidden folder.charon
.To view it, run
ls -al
in your terminal.You can then copy the key to your
~/Downloads
folder for easy access by runningcp .charon/charon-enr-private-key ~/Downloads
. This step maybe a bit different for windows.Else, if you are on
macOS
, pressCmd + Shift + .
to view the.charon
folder in the finder application.
Lighthouse
Downloading historical blocks
This means that Lighthouse is still syncing which will throw a lot of errors down the line. Wait for the sync before moving further.
Failed to request attester duties
error
Failed to request attester duties
errorIndicates there is something wrong with your lighthouse beacon node. This might be because the request buffer is full as your node is never starting consensus since it never gets the duties.
Not enough time for a discovery seach
error
Not enough time for a discovery seach
errorThis could be linked to a internet connection being to slow or relying on a slow third-party service such as Infura.
Beacon Node
Error communicating with Beacon Node API
&
Error while connecting to beacon node event stream
Error communicating with Beacon Node API
&
Error while connecting to beacon node event stream
This is likely due to lighthouse not done syncing, wait and try again once synced. Can also be linked to Teku keystore issue.
Clock sync issues
Either your clock server time is off, or you are talking to a remote beacon client that is super slow (this is why we advise against using services like infura).
My beacon node API is flaky with lots of errors and timeouts
A good quality beacon node API is critical to validator performance. It is always advised to run your own beacon node to ensure low latencies to boost validator performance.
Using 3rd party services like Infura's beacon node API has significant disadvantages since the quality is often low. Requests often return 500s or timeout (Charon times out after 2s). This results in lots of warnings and errors and failed duties. We are working on an issue to mitigate against this, but running a local beacon node is still always preferred. We are not yet considering increasing the 2s timeout since that can have knock-on effects.
Charon
Attester failed in consensus component
error
Attester failed in consensus component
errorThe required number of operators defined in your cluster-lock file is probably not online to sign successfully. Make sure all operators are running the latest version of charon. To check if some peers are not online:
docker logs charon-distributed-validator-node-charon-1 2>&1 | grep 'absent'
Load private key
error
Load private key
errorMake sure you have successfully run a DKG before running the node. The key
should be created and placed in the right directory during the ceremony
Also, make sure you are working in the right directory:
charon-distributed-validator-node
Failed to confirm node connection
error
Failed to confirm node connection
errorWait for Teku & Lighthouse sync to be complete.
Reserve relay circuit: reservation failed
error
Reserve relay circuit: reservation failed
errorRESERVATION_REFUSED
is returned by the libp2p relay when some maximum
limit has been reached. This is most often due to "maximum reservations per IP/peer".
This is when your charon node is restarting or in some error loop and constantly
attempting to create new relay reservations reaching the maximum.
To fix this error, stop your charon node for 30mins before restarting it. This should allow the relay enough time to reset your ip/peer limits and should then allow new reservations. This could also be due to the relay being overloaded in general, so reaching a server wide "maximum connections" limit. This is an issue with relay scalability and we are working in a long term fix for this.
Error opening relay circuit: NO_RESERVATION
error
Error opening relay circuit: NO_RESERVATION
errorError opening relay circuit NO_RESERVATION (204)
indicates the peer
isn't connected to the relay, so the the charon client cannot connect to the
peer via the relay. That might be because the peer is offline or the peer is
configured to connect to a different relay.
To fix this error, ensure the peer is online and configured with the exact
same --p2p-relays
flag.
Couldn't fetch duty data from the beacon node
error
Couldn't fetch duty data from the beacon node
errormsgFetcher
indicates a duty failed in the fetcher component when
it failed to fetch the required data from the beacon node API. This indicates
a problem with the upstream beacon node.
Couldn't aggregate attestation due to failed attester duty
error
Couldn't aggregate attestation due to failed attester duty
errormsgFetcherAggregatorNoAttData
indicates an attestation aggregation
duty failed in the fetcher component since it couldn't fetch the prerequisite
attestation data. This indicates the associated attestation duty failed to obtain
a cluster agreed upon value.
Couldn't aggregate attestation due to insufficient partial v2
committee subscriptions
error
Couldn't aggregate attestation due to insufficient partial v2
committee subscriptions
msgFetcherAggregatorZeroPrepares
indicates an attestation aggregation
duty failed in the fetcher component since it couldn't fetch the prerequisite
aggregated v2 committee subscription. This indicates the associated prepare aggregation
duty failed due to no partial v2 committee subscription submitted by the cluster
validator clients.
Couldn't aggregate attestation due to failed prepare aggregator duty
error
Couldn't aggregate attestation due to failed prepare aggregator duty
msgFetcherAggregatorFailedPrepare
indicates an attestation aggregation
duty failed in the fetcher component since it couldn't fetch the prerequisite
aggregated v2 committee subscription. This indicates the associated prepare aggregation
duty failed.
Couldn't propose block due to insufficient partial randao signatures
error
Couldn't propose block due to insufficient partial randao signatures
msgFetcherProposerFewRandaos
indicates a block proposer duty failed
in the fetcher component since it couldn't fetch the prerequisite aggregated
RANDAO. This indicates the associated randao duty failed due to insufficient
partial randao signatures submitted by the cluster validator clients.
Couldn't propose block due to zero partial randao signatures
error
Couldn't propose block due to zero partial randao signatures
msgFetcherProposerZeroRandaos
indicates a block proposer duty failed
in the fetcher component since it couldn't fetch the prerequisite aggregated
RANDAO. This indicates the associated randao duty failed due to no partial randao
signatures submitted by the cluster validator clients.
Couldn't propose block due to failed randao duty
error
Couldn't propose block due to failed randao duty
errormsgFetcherProposerZeroRandaos
indicates a block proposer duty failed
in the fetcher component since it couldn't fetch the prerequisite aggregated
RANDAO. This indicates the associated randao duty failed.
Consensus algorithm didn't complete
error
Consensus algorithm didn't complete
errormsgConsensus
indicates a duty failed in consensus component. This
could indicate that insufficient honest peers participated in consensus or p2p
network connection problems.
Signed duty not submitted by local validator client
error
Signed duty not submitted by local validator client
errormsgValidatorAPI
indicates that partial signature were never submitted
by the local validator client. This could indicate that the local validator client
is offline, or has connection problems with charon, or has some other problem.
See validator client logs for more details.
Bug: partial signature database didn't trigger partial signature
exchange
error
Bug: partial signature database didn't trigger partial signature
exchange
msgParSigDBInternal
indicates a bug in the partial signature database
as it is unexpected.
No partial signatures received from peers
error
No partial signatures received from peers
errormsgParSigEx
indicates that no partial signature for the duty was
received from any peer. This indicates all peers are offline or p2p network connection
problems.
Insufficient partial signatures received, minimum required threshold
not reached
error
Insufficient partial signatures received, minimum required threshold
not reached
msgParSigDBThreshold
indicates that insufficient partial signatures
for the duty was received from peers. This indicates problems with peers or p2p
network connection problems.
Bug: threshold aggregation of partial signatures failed due to
inconsistent signed data
error
Bug: threshold aggregation of partial signatures failed due to
inconsistent signed data
msgSigAgg
indicates that BLS threshold aggregation of sufficient
partial signatures failed. This indicates inconsistent signed data. This indicates
a bug in charon as it is unexpected.
Existing private key lock file found, another charon instance may be
running on your machine
error
Existing private key lock file found, another charon instance may be
running on your machine
When you turn on the --private-key-file-lock
option in Charon, it
checks for a special file called the private key lock file. This file has the
same name as the ENR private key file but with a .lock
extension.
If the private key lock file exists and is not older than 5 seconds, Charon won't
run. It doesn't allow running multiple Charon instances with the same ENR private
key. If the private key lock file has a timestamp older than 5 seconds, Charon
will replace it and continue with its work. If you're sure that no other Charon
instances are running, you can delete the private key lock file.
Validator api 5xx response: mismatching validator client key share
index, Mth key share submitted to Nth charon peer
error
Validator api 5xx response: mismatching validator client key share
index, Mth key share submitted to Nth charon peer
The issue revolves around an invalid setup or deployment, where keyshares are not tied to the enr private key. There appears to be a mix-up during deployment, leading to a mismatching validator client key share index.
- For example:
Imagine node N is Alice, and node M is Bob, the error would read:
mismatching validator client key share index, Bob's key share submitted
to Alice's charon node.
Bob's private key share(s) are imported to a VC that is connected to Alice's charon node. This is a invalid setup/deployment. Alice's charon node should only be connected to Alice's VC.
Check the keyshare of each node inside cluster-lock.json and see that matches
with the key inside node(num)/validator_keys/keystore-0.json
Teku
Teku keystore file
error
keystore file
error Teku sometimes logs an error which looks like
Keystore file /opt/charon/validator_keys/keystore-0.json.lock already in
use.
This can be solved by deleting the file(s) ending with .lock
in
the folder .charon/validator_keys
. It is caused by an unsafe
shut down of Teku (usually by double pressing Ctrl+C
to shutdown
containers faster).
Grafana
How to fix the grafana dashboard?
Sometimes, grafana dashboard doesn't load any data first time around.You can solve this by following the steps below:
- Click the Wheel Icon > Datasources
- Click prometheus
Change the "Access" field from
Server (default)
toBrowser
. Press "Save & Test". It should fail.Change the "Access" field back to
Server (default)
and press "Save & Test". You should be presented with a green success icon saying "Data source is working" and you can return to the dashboard page.
N/A
& No data
in validator info panel
N/A
& No data
in validator info panelPrometheus
Unauthorized: authentication error: invalid token
Unauthorized: authentication error: invalid token
You can ignore this error unless you have been contacted by the Obol Team with monitoring credentials. In that case, follow
Getting Started Monitoring your Node
in our advanced guides. It does not affect cluster performance or prevent the cluster from running.
Docker
How to fix permission denied
errors?
permission denied
errors? Permission denied errors can come up in a variety of manners, particularly on Linux and WSL for Windows systems. In the interest of security, the charon docker image runs as a non-root user, and this user often does not have the permissions to write in the directory you have checked out the code to. This can be generally be fixed with some of the following:
Running docker commands with
sudo
, if you haven'tsetup docker to be run as a non-root user
.
Changing the permissions of the
.charon
folder with the commands:mkdir .charon
(if it doesn't already exist)sudo chmod -R 666 .charon
I see a lot of errors after running docker compose up
docker compose up
It's because both geth and lighthouse start syncing and so there's
connectivity issues among the containers. Simply let the containers run for
a while. You won't observe frequent errors when geth finishes syncing. You
can also add a second beacon node endpoint for something like infura by
adding a comma separated API URL to the end of
CHARON_BEACON_NODE_ENDPOINTS
in the
docker-compose(./docker-compose.yml#84).
How do I fix the plugin "loki" not found
error?
plugin "loki" not found
error?If you get the following error when calling docker compose up
:
Error response from daemon: error looking up logging plugin loki: plugin
"loki" not found
.
Then it probably means that the Loki docker driver isn't installed. In that case,
run the following command to install loki:
docker plugin install grafana/loki-docker-driver:latest --alias loki
--grant-all-permissions
Relay
Resolve IP of p2p external host flag: lookup replace.with.public.ip.or.hostname:
no such host
error
Resolve IP of p2p external host flag: lookup replace.with.public.ip.or.hostname:
no such host
Replace replace.with.public.ip.or.hostname
in the
relay/docker-compose.yml with your real public IP or DNS hostname.
Timeout resolving bootnode ENR: context deadline exceeded
error
Timeout resolving bootnode ENR: context deadline exceeded
errorThe relay you are trying to connect to your peers via is offline or unreachable.
Lodestar
warn: Potential next epoch attester duties reorg
error
warn: Potential next epoch attester duties reorg
errorLodestar logs these warnings because charon is not able to return proper
dependent_root
value in getAttesterDuties
API
response whenever lodestar calls this API. This is because charon uses
go-eth2-client
for all the beacon API calls and it doesn't
provide dependent_root
value in responses. We have reported
this to them here.