Software Development at Obol
When hardening a project's technical security, team members' operational security and the security of the team's software development practices are some of the most critical areas to secure. Many hacks and compromises in the space to date have resulted from these attack vectors rather than from exploits of the software itself.
With this in mind, in January 2023 the Obol team retained Alex Wade, a security researcher at Ethereal Ventures, to interview key stakeholders and produce a report on the team's Software Development Lifecycle.
This page is based on that report. Some sensitive information has been redacted, and the page includes responses to the recommendations made, detailing the actions the Obol team has taken to mitigate the issues highlighted.
Obol Report
Prepared by: Alex Wade (Ethereal Ventures)
Date: Jan 2023
Over the past month, I worked with Obol to review their software development practices in preparation for their upcoming security audits. My goals were to review and analyze:
- Software development processes
- Vulnerability disclosure and escalation procedures
- Key personnel risk
The information in this report was collected through a series of interviews with Obol’s project leads.
Contents
- Background Info
- Analysis - Cluster Setup and DKG
- Key Risks
- Potential Attack Scenarios
- Recommendations
- R1: Users should deploy cluster contracts through a known on-chain entry point
- R2: Users should deposit to the beacon chain through a pool contract
- R3: Raise the barrier to entry to push an update to the Launchpad
- Additional Notes
- Vulnerability Disclosure
- Key Personnel Risk
Background Info
Each team lead was asked to describe Obol in terms of its goals, objectives, and key features.
What is Obol?
Obol builds DVT (Distributed Validator Technology) for Ethereum.
What is Obol’s goal?
Obol’s goal is to solve a classic distributed systems problem: uptime.
Rather than requiring Ethereum validators to stake on their own, Obol allows groups of operators to stake together. Using Obol, a single validator can be run cooperatively by multiple people across multiple machines.
In theory, this architecture provides validators with some redundancy against common issues: server and power outages, client failures, and more.
What are Obol’s objectives?
Obol’s business objective is to provide base-layer infrastructure to support a distributed validator ecosystem. As Obol provides base layer technology, other companies and projects will build on top of Obol.
Obol’s business model is to eventually capture a portion of the revenue generated by validators that use Obol infrastructure.
What is Obol’s product?
Obol’s product consists of three main components, each run by its own team: a webapp, a client, and smart contracts.
- DV Launchpad: A webapp to create and manage distributed validators.
- Charon: A middleware client that enables operators to run distributed validators.
- Solidity: Withdrawal and fee recipient contracts for use with distributed validators.
Analysis - Cluster Setup and DKG
The Launchpad guides users through the process of creating a cluster, which defines important parameters like the validator’s fee recipient and withdrawal addresses, as well as the identities of the operators in the cluster. In order to ensure their cluster configuration is correct, users need to rely on a few different factors.
First, users need to trust the Charon client to perform the DKG correctly, and validate things like:
- Config file is well-formed and is using the expected version
- Signatures and ENRs from other operators are valid
- Cluster config hash is correct
- DKG succeeds in producing valid signatures
- Deposit data is well-formed and is correctly generated from the cluster config and DKG.
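The "cluster config hash is correct" check comes down to each operator recomputing the hash locally and comparing it with the value that came with the file. The sketch below illustrates only that principle. Charon's actual hashing scheme is not described in this report, so SHA-256 over canonical JSON is used as a stand-in, and every field name in the example definition is hypothetical.

```python
import hashlib
import json


def config_hash(definition: dict) -> str:
    """Hash a cluster definition deterministically (stand-in scheme, NOT Charon's)."""
    # Canonical JSON: sorted keys and fixed separators, so equal content hashes equally.
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def hash_matches(definition: dict, claimed_hash: str) -> bool:
    """Recompute the hash locally instead of trusting the value shipped with the file."""
    return config_hash(definition) == claimed_hash.lower()


# Hypothetical cluster definition, for illustration only.
definition = {
    "name": "example-cluster",
    "num_validators": 1,
    "threshold": 3,
    "operators": ["enr:-op1", "enr:-op2", "enr:-op3", "enr:-op4"],
}
claimed = config_hash(definition)

# A definition whose signing threshold was silently lowered no longer matches.
tampered = dict(definition, threshold=2)
print(hash_matches(definition, claimed), hash_matches(tampered, claimed))
```

Any change to a field, such as the lowered threshold above, produces a different hash, which is what lets operators detect a definition that differs from the one they agreed on.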
However, Charon’s validation is limited to the digital: signature checks, cluster file syntax, etc. It does NOT help would-be operators determine whether the other operators listed in their cluster definition are the real people with whom they intend to start a DVT cluster. So -
Second, users need to come to social consensus with fellow operators. While the cluster is being set up, it’s important that each operator is an active participant. Each member of the group must validate and confirm that:
- the cluster file correctly reflects their address and node identity, and reflects the information they received from fellow operators
- the cluster parameters are expected – namely, the number of validators and signing threshold
Finally, users need to perform independent validation. Each user should perform their own validation of the cluster definition:
- Is my information correct? (address and ENR)
- Does the information I received from the group match the cluster definition?
- Is the ETH2 deposit data correct, and does it match the information in the cluster definition?
- Are the withdrawal and fee recipient addresses correct?
These final steps are potentially the most difficult, and may require significant technical knowledge.
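One of these checks can be partly automated: confirming that the withdrawal credentials in the deposit data point at the expected withdrawal address. The sketch below assumes the cluster uses execution-address (0x01-type) withdrawal credentials, which consist of the prefix byte 0x01, eleven zero bytes, and the 20-byte address. The function names and the example address are hypothetical.

```python
def expected_withdrawal_credentials(address: str) -> str:
    """Build 0x01-type withdrawal credentials for a 20-byte execution address."""
    raw = bytes.fromhex(address.removeprefix("0x"))
    if len(raw) != 20:
        raise ValueError("execution address must be 20 bytes")
    # Layout: 0x01 prefix byte, 11 zero bytes, then the 20-byte address (32 bytes total).
    return "0x" + (b"\x01" + b"\x00" * 11 + raw).hex()


def credentials_match(deposit_credentials: str, withdrawal_address: str) -> bool:
    """Compare the credentials found in deposit data with the address the group agreed on."""
    return deposit_credentials.lower() == expected_withdrawal_credentials(withdrawal_address)


# Hypothetical address, for illustration only.
agreed_address = "0x" + "ab" * 20
from_deposit_data = "0x01" + "00" * 11 + "ab" * 20
print(credentials_match(from_deposit_data, agreed_address))
```

A match only shows that the deposit pays out to the agreed address. It says nothing about whether the contract deployed at that address is the intended one.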
Key Risks
1. Validation of Contract Deployment and Deposit Data Relies Heavily on Launchpad
From my interviews, it seems that the user deploys both the withdrawal and fee recipient contracts through the Launchpad.
What I’m picturing is that during the first parts of the cluster setup process, the user is prompted to sign one or more transactions deploying the withdrawal and fee recipient contracts to mainnet. The Launchpad apparently uses an npm package to deploy these contracts: 0xsplits/splits-sdk, which I assume provides either JSON artifacts or a factory address on chain. The Launchpad then places the deployed contracts into the cluster config file, and the process moves on.
If an attacker has published a malicious update to the Launchpad (or compromised an underlying dependency), the contracts deployed by the Launchpad may be malicious. The questions I’d like to pose are:
- How does the group creator know the Launchpad deployed the correct contracts?
- How does the rest of the group know the creator deployed the contracts through the Launchpad?
My understanding is that this ultimately comes down to the independent verification that each of the group’s members performs during and after the cluster’s setup phase.
At its worst, this verification might consist solely of the cluster creator confirming to the others that, yes, those addresses match the contracts I deployed through the Launchpad.
A more sophisticated user might verify that not only do the addresses match, but the deployed source code looks roughly correct. However, this step is far out of the realm of many would-be validators. To be really certain that the source code is correct would require auditor-level knowledge.
The risk is that:
- the deployed contracts are NOT the correctly-configured 0xsplits waterfall/fee splitter contracts
- most users are ill-equipped to make this determination themselves
- we don’t want to trust the Launchpad as the single source of truth
In the worst case, the cluster may end up depositing with malicious withdrawal or fee recipient credentials. If unnoticed, this may net an attacker the entire withdrawal amount, once the cluster exits.
Note that the same (or similar) risks apply to validation of deposit data, which has the potential to be similarly difficult. I’m a little fuzzy on which part of the Obol stack actually generates the deposit data / deposit transaction, so I can’t speak to this as much. However, I think the mitigation for both of these is roughly the same - read on!
Mitigation:
It’s certainly a good idea to make it harder to deploy malicious updates to the Launchpad, but this may not be entirely possible. A higher-yield strategy may be to educate and empower users to perform independent validation of the DVT setup process - without relying on information fed to them by Charon and the Launchpad.
I’ve outlined some ideas for this in #R1 and #R2.
2. Social Consensus, aka “Who sends the 32 ETH?”
Depositing to the beacon chain requires a total of 32 ETH. Obol’s product allows multiple operators to act as a single validator together, which means would-be operators need to agree on how to fund the 32 ETH needed to initiate the deposit.
It is my understanding that currently, this process comes down to trust and loose social consensus. Essentially, the group needs to decide who chips in what amount together, and then trust someone to take the 32 ETH and complete the deposit process correctly (without running away with the money).
Granted, the initial launch of Obol will be open only to a small group of people as the kinks in the system get worked out - but in preparation for an eventual public release, the deposit process needs to be much simpler and far less reliant on trust.
Mitigation: See #R2.
Potential Attack Scenarios
During the interview process, I learned that each of Obol’s core components has its own GitHub repo, and that each repo has roughly the same structure in terms of organization and security policies. For each repository:
- There are two overall GitHub organization administrators, and a number of people have administrative control over individual repositories.
- In order to merge PRs, the submitter needs:
  - CI/CD checks to pass
  - Review from one person (anyone at Obol)
Of course, admin access also means the ability to change these settings - so repo admins could theoretically merge PRs without needing checks to pass and without review or approval. Organization admins can control the full GitHub organization.
The following scenarios describe the impact an attack may have.
1. Publishing a malicious version of the Launchpad, or compromising an underlying dependency
- Reward: High
- Difficulty: Medium-Low
As described in Key Risks, publishing a malicious version of the Launchpad has the potential to net the largest payout for an attacker. By tampering with the cluster’s deposit data or withdrawal/fee recipient contracts, an attacker stands to gain 32 ETH or more per compromised cluster.
During the interviews, I learned that merging PRs to main in the Launchpad repo triggers an action that publishes to the site. Given that merges can be performed by an authorized Obol developer, this makes the developers prime targets for social engineering attacks.
Additionally, the use of the 0xsplits/splits-sdk NPM package to aid in contract deployment may represent a supply chain attack vector. It may be that this applies to other Launchpad dependencies as well.
In any case, with a fairly large surface area and high potential reward, this scenario represents a credible risk to users during the cluster setup and DKG process.
See #R1, #R2, and #R3 for some ideas to address this scenario.
2. Publishing a malicious version of Charon to new operators
- Reward: Medium
- Difficulty: High
During the cluster setup process, Charon is responsible both for validating the cluster configuration produced by the Launchpad and for performing a DKG ceremony between a group’s operators.
If new operators use a malicious version of Charon to perform this process, it may be possible to tamper with both of these responsibilities, or even get access to part or all of the underlying validator private key created during DKG.
However, the difficulty of this type of attack seems quite high. An attacker would first need to carry out the same type of social engineering attack described in scenario 1 to publish and tag a new version of Charon. Crucially, users would also need to install the malicious version - unlike the Launchpad, an update here is not pushed directly to users.
As long as Obol is clear and consistent with communication around releases and versioning, it seems unlikely that a user would both install a brand-new, unannounced release, and finish the cluster setup process before being warned about the attack.
3. Publishing a malicious version of Charon to existing validators
- Reward: Low
- Difficulty: High
Once a distributed validator is up and running, much of the danger has passed. As a middleware client, Charon sits between a validator’s consensus and validator clients. As such, it shouldn’t have direct access to a validator’s withdrawal keys nor signing keys.
If existing validators update to a malicious version of Charon, the worst an attacker could theoretically do is likely to slash the validator. However, assuming Charon has no access to any private keys, this would also require one or more validator clients connected to Charon to fail to prevent the signing of a slashable message. In practice, a compromised Charon client is more likely to pose liveness risks than safety risks.
This is not likely to be particularly motivating to potential attackers - and paired with the high difficulty described above, this scenario seems unlikely to cause significant issues.
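To make concrete what "preventing the signing of a slashable message" involves on the validator client side, here is a minimal sketch of the two attestation slashing conditions, double votes and surround votes. It is not the logic of any particular validator client. The `Attestation` type and function names are hypothetical, and block proposals are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    source_epoch: int
    target_epoch: int
    data_root: str  # stands in for the hash of the full attestation data


def is_slashable(previous: Attestation, candidate: Attestation) -> bool:
    """Return True if signing `candidate` after `previous` would be a slashable offence."""
    # Double vote: two different attestations for the same target epoch.
    double_vote = (
        previous.target_epoch == candidate.target_epoch
        and previous.data_root != candidate.data_root
    )
    # Surround vote: one attestation's source/target span strictly contains the other's.
    previous_surrounds = (
        previous.source_epoch < candidate.source_epoch
        and candidate.target_epoch < previous.target_epoch
    )
    candidate_surrounds = (
        candidate.source_epoch < previous.source_epoch
        and previous.target_epoch < candidate.target_epoch
    )
    return double_vote or previous_surrounds or candidate_surrounds


def safe_to_sign(history: list[Attestation], candidate: Attestation) -> bool:
    """Refuse to sign anything that conflicts with a previously signed attestation."""
    return not any(is_slashable(prev, candidate) for prev in history)


history = [Attestation(source_epoch=10, target_epoch=11, data_root="0xaa")]
print(safe_to_sign(history, Attestation(11, 12, "0xbb")))  # normal next attestation
print(safe_to_sign(history, Attestation(10, 11, "0xcc")))  # conflicting vote for epoch 11
```

Because this check runs in the validator client against its own signing history, a malicious middleware client cannot cause a slashing simply by asking for a conflicting signature.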
Recommendations
R1: Users should deploy cluster contracts through a known on-chain entry point
During setup, users should only sign one transaction via the Launchpad - to a contract located at an Obol-held ENS (e.g. launchpad.obol.eth). This contract should deploy everything needed for the cluster to operate, like the withdrawal and fee recipient contracts. It should also initialize them with the provided reward split configuration (and any other config needed).
Rather than using an NPM library to supply a factory address or JSON artifacts, this has the benefit of being both:
- Harder to compromise: as long as the user knows launchpad.obol.eth, it’s pretty difficult to trick them into deploying the wrong contracts.
- Easier to validate for non-technical users: the Obol contract can be queried for deployment information via Etherscan.
Note that in order for this to be successful, Obol needs to provide detailed steps for users to perform manual validation of their cluster setups. Users should be able to treat this as a “checklist:”
- Did I send a transaction to launchpad.obol.eth?
- Can I use the ENS name to locate and query the deployment manager contract on Etherscan?
- If I input my address, does Etherscan report the configuration I was expecting?
  - withdrawal address matches
  - fee recipient address matches
  - reward split configuration matches
As long as these steps are plastered all over the place (i.e. not just on the Launchpad) and Obol puts in effort to educate users about the process, this approach should allow users to validate cluster configurations themselves - regardless of Launchpad or NPM package compromise.
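The last checklist item can be expressed as a simple comparison between what the operator expected and what the deployment manager contract reports. The sketch below assumes the operator has copied the reported values by hand from a block explorer. The `ClusterDeployment` type, its fields, and the percentage-based reward split are hypothetical and do not describe the interface of any existing contract.

```python
from dataclasses import dataclass


@dataclass
class ClusterDeployment:
    withdrawal_address: str
    fee_recipient_address: str
    reward_split: dict[str, int]  # recipient address -> share in percent (hypothetical unit)


def _norm(address: str) -> str:
    # Addresses may differ only in case (checksum casing), so compare lowercased.
    return address.strip().lower()


def checklist_failures(expected: ClusterDeployment, reported: ClusterDeployment) -> list[str]:
    """Return a description of every checklist item that does not match."""
    failures = []
    if _norm(expected.withdrawal_address) != _norm(reported.withdrawal_address):
        failures.append("withdrawal address mismatch")
    if _norm(expected.fee_recipient_address) != _norm(reported.fee_recipient_address):
        failures.append("fee recipient address mismatch")
    expected_split = {_norm(a): s for a, s in expected.reward_split.items()}
    reported_split = {_norm(a): s for a, s in reported.reward_split.items()}
    if expected_split != reported_split:
        failures.append("reward split mismatch")
    return failures


# Hypothetical values, for illustration only.
expected = ClusterDeployment("0xAAA1", "0xBBB2", {"0xC1": 50, "0xC2": 50})
reported = ClusterDeployment("0xaaa1", "0xbbb2", {"0xc1": 90, "0xc2": 10})
print(checklist_failures(expected, reported))
```

An empty list means every item matched. Anything else should stop the group from proceeding to the deposit.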
Obol’s response
Roadmapped: add the ability for the OWR factory to claim and transfer its reverse resolution ownership.
R2: Users should deposit to the beacon chain through a pool contract
Once cluster setup and DKG are complete, a group of operators should deposit to the beacon chain by way of a pool contract. The pool contract should:
- Accept ETH from any of the group’s operators
- Stop accepting ETH when the contract’s balance hits (32 ETH * number of validators)
- Make it easy to pull the trigger and deposit to the beacon chain once the critical balance has been reached
- Offer all of the group’s operators a “bail” option at any point before the deposit is triggered
Ideally, this contract is deployed during the setup process described in #R1, as another step toward allowing users to perform independent validation of the process.
Rather than relying on social consensus, this should:
- Allow operators to fund the validator without needing to trust any single party
- Make it harder to mess up the deposit or send funds to some malicious actor, as the pool contract should know what the beacon deposit contract address is
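The behaviour described for the pool contract can be modelled as a small state machine. The following is a plain Python model of the rules above, not contract code and not Obol's implementation. The class and method names are hypothetical, amounts are whole ETH for readability (a real contract would track wei), and the actual call to the beacon deposit contract is reduced to a returned amount.

```python
ETH_PER_VALIDATOR = 32


class DepositPool:
    """Toy model of the recommended pool: fund, bail, then deposit once full."""

    def __init__(self, operators: set[str], num_validators: int):
        self.operators = set(operators)
        self.target = ETH_PER_VALIDATOR * num_validators
        self.contributions: dict[str, int] = {}
        self.deposited = False

    @property
    def balance(self) -> int:
        return sum(self.contributions.values())

    def contribute(self, operator: str, amount: int) -> None:
        if self.deposited:
            raise RuntimeError("deposit already triggered")
        if operator not in self.operators:
            raise PermissionError("not one of the group's operators")
        # Stop accepting ETH once the target balance would be exceeded.
        if amount <= 0 or self.balance + amount > self.target:
            raise ValueError("amount must be positive and must not exceed the target")
        self.contributions[operator] = self.contributions.get(operator, 0) + amount

    def bail(self, operator: str) -> int:
        """Return the operator's full contribution at any point before the deposit."""
        if self.deposited:
            raise RuntimeError("deposit already triggered")
        return self.contributions.pop(operator, 0)

    def trigger_deposit(self) -> int:
        """Mark the pool as deposited and return the amount sent to the beacon chain."""
        if self.deposited or self.balance != self.target:
            raise RuntimeError("target balance not reached or deposit already made")
        self.deposited = True
        return self.target


pool = DepositPool({"alice", "bob"}, num_validators=1)
pool.contribute("alice", 16)
pool.contribute("bob", 16)
print(pool.trigger_deposit())
```

In this model no single operator ever holds the other operators' funds, and the deposit destination is fixed by the pool rather than chosen by whoever triggers the deposit.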
Obol’s response
Roadmapped: give the operators a streamlined, secure way to deposit Ether (ETH) to the beacon chain collectively, satisfying specific conditions:
- Pooling from multiple operators.
- Ceasing to accept ETH once a critical balance is reached, defined by 32 ETH multiplied by the number of validators.
- Facilitating an immediate deposit to the beacon chain once the target balance is reached.
- Providing a 'bail-out' option for operators to withdraw their contribution before the group's deposit to the beacon chain is initiated.
R3: Raise the barrier to entry to push an update to the Launchpad
Currently, any repo admin can publish an update to the Launchpad unchecked.
Given the risks and scenarios outlined above, consider amending this process so that the compromise of a single admin is not sufficient to publish to the Launchpad site. It may be worthwhile to require both admins to approve publishing to the site.
In addition to adding prerequisites for publishing an update to the Launchpad, ensure that both admins have enabled some level of multi-factor authentication on their GitHub accounts.
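Stated as a rule, the recommendation is that publishing requires passing checks, approval from every admin, and MFA on every admin account. The sketch below expresses that rule as a hypothetical policy check. It does not call GitHub or describe its API, and in practice the rule would be enforced through repository settings rather than custom code.

```python
def can_publish(
    admins: set[str],
    approvals: set[str],
    mfa_enabled: dict[str, bool],
    checks_passed: bool,
) -> bool:
    """Allow publishing only if no single compromised admin could do it alone."""
    if not admins:
        return False  # an empty admin set must never count as "everyone approved"
    all_approved = admins <= approvals  # every admin appears among the approvers
    all_mfa = all(mfa_enabled.get(admin, False) for admin in admins)
    return checks_passed and all_approved and all_mfa


# Hypothetical admin names, for illustration only.
admins = {"admin-a", "admin-b"}
mfa = {"admin-a": True, "admin-b": True}
print(can_publish(admins, {"admin-a"}, mfa, checks_passed=True))
print(can_publish(admins, {"admin-a", "admin-b"}, mfa, checks_passed=True))
```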
Obol’s response
We removed individuals' ability to merge changes without review, enforced MFA and signed commits, and employed the Bulldozer bot to make sure a PR gets merged automatically when all checks pass.
Additional Notes
Vulnerability Disclosure
During the interviews, I got some conflicting information when asking about Obol’s vulnerability disclosure process.
Some interviewees directed me towards Obol’s security repo, which details security contacts: ObolNetwork/obol-security, while others answered that disclosure should happen primarily through Immunefi. While these may both be part of the correct answer, it seems that Obol’s disclosure process may not be as well-defined as it could be. Here are some notes:
- I wasn’t able to find information about Obol on Immunefi. I also didn’t find any reference to a security contact or disclosure policy in Obol’s docs.
- When looking into the obol-security repo, I noticed broken links in a few of the sections in README.md and SECURITY.md:
  - Security policy
  - More Information
- Some of the text and links in the Bug Bounty Program don’t seem to apply to Obol (see text referring to Vaults and Strategies).
- The Receiving Disclosures section does not include a public key with which submitters can encrypt vulnerability information.
It’s my understanding that these items are probably lower priority due to Obol’s initial closed launch - but these should be squared away soon!
Obol’s response
We addressed all of the concerns in the obol-security repository:
- The security policy link has been fixed
- The Bug Bounty program received an overhaul and clearly states rewards, eligibility, and scope
- We list two GPG public keys with which submitters can encrypt vulnerability reports.
We are actively working towards integrating Immunefi in our security pipeline.
Key Personnel Risk
A final section on the specifics of key personnel risk faced by Obol has been redacted from the original report. Particular areas of control highlighted were GitHub org ownership and domain name control.
Obol’s response
These risks have been mitigated by adding an extra admin to the GitHub org and by setting up a second DNS stack in case the primary one fails, along with general OpSec improvements.