What is Formal Verification?

Proving program correctness is a difficult problem, in smart contracts doing so is essential as the common use of these smart contracts in financial applications means possible stealing of funds if the contracts don't behave as expected. Formal verification is perhaps one of the most robust methods we can use to prove program correctness by determining if a system meets a predefined specification. In essence, formal verification allows us to determine if all possible paths through a system behave as we expect and there are no edge cases causing behavior that we haven’t accounted for.

Fundamentally, formal verification allows us to abstract the code in our system into mathematical formulas that can then be evaluated with an SMT (Satisfiability Modulo Theories) solver which can generate mathematical proofs to determine if the code matches the defined specification.

The core ideas of formal verification are best understood through concrete examples, so we'll first apply the ideas to an abstract coin-operated turnstile to see how we define the expected behavior for our system using a specification, then look at a coded example of applying specifications to verify smart contract behavior.

Example: Coin-Operated Turnstile

We define the core component of our system to be a turnstile that has two possible states in which it can exist: closed or open. This state can be modified using two operations that we define on it: “push” (pushing the turnstile arms) and “insert coin”. We can further say that these operations (and only these operations) will allow us to transition to a new state (if some condition has been met) or stay in the same state (if some condition hasn’t been met) depending on which operation was applied. Formalizing these rules into a visual diagram we get the following:

src: https://www.moritz.systems/blog/an-introduction-to-formal-verification/

We can see that our turnstile should only allow transitions between the two states under the following conditions: if from the locked state a coin is inserted or from the unlocked state one of the arms is pushed. All other possible actions cause the system to remain in the same state. We can call the combination of rules defined in this diagram our specification which we will compare with our system implementation in order to ensure that it behaves correctly.

Our main goal in applying formal verification to this system would be to ensure that it works as expected by only reaching a new state using the possible paths defined in our specification. Something we would want our verification to catch is if, for example, the turnstile were to become unlocked with no coin being inserted or remain unlocked after a push operation was performed; these would indicate there is a bug in our system because our specification was not satisfied. Specifications therefore are the key to verifying our system as they define the behavior it should follow and will allow our proving system to catch any deviations from this expected behavior.

Formal Specifications

In a traditional manual review of a contract, a developer or auditor must identify all possible states that the system may exist in by manually reviewing contracts and how their states are modified via function calls and may implement unit tests to test transitions to and from allowed state. However, this has limitations and often leaves blindspots of possible unexpected state transitions, reverts or state variable values. When applying formal verification to our contracts we therefore want to ensure we catch all of these potential blindspots with our formal specifications.

Formal specifications allow us to define properties (high-level specifications about the system's behavior; e.g. ERC20 token transfers must reduce the sender balance) that the contract must satisfy. We can then test these properties using invariants, which are logical assertions about a contract’s state that must remain true after every state transition. Formal verification therefore allows us to determine whether these invariants hold in a contract and that they aren’t violated (i.e. the logical assertion evaluates to false) when the contract is interacted with. In Certora we create this formal specification using the Certora Verification Language (CVL) which when passed into the Certora Prover with the contract code allows us to verify that the implementation of a contract complies with its specification. If our specification doesn’t match the contract implementation, we’ll be given a counterexample demonstrating a case for which our specification fails.

When creating specifications in CVL we model how contracts transition through different states by calls to program functions that modify the contract state and defining properties that should hold after these transitions. We create specifications using Hoare Logic style properties, which allow us to reason about the correctness of our contract. Syntactically Hoare-style properties are written as {P}c{Q} where c is a program (contract) and P,Q are predicates (logical statements expressing properties that should hold on the system) on the state of c which we call the preconditions and postconditions, respectively.

Preconditions, as the name suggests, let us describe correct conditions that exist before the execution of a function. Postconditions similarly describe the state that the system transitions to if the function execution is as expected. In a Hoare logic system we can define invariants as predicates that are preserved before and after a function has been executed. We can further narrow this specification to only include all reachable states and exclude states that violate our predicate but which are unreachable and therefore uninteresting to us as they don’t tell us anything about their ability to break the system. In CVL, when we use the keyword invariant, we’re referring to inductive invariants which hold in all initial states (e.g: s). Then, after the system undergoes a transition to a new state (e.g: s’) the invariant holds in this state as well.

Invariants allow us to declare properties of the system which must always hold true, in other words no operation on the system should allow it to reach an unexpected state. In a simple ERC20 contract an invariant we could define for the system would be that the sum of account balances can never be greater than the totalSupply of the token. These invariants however don’t allow us to analyze calls to specific functions in the contract and how it may trigger a revert or how it may modify state variables, so verifying something like a call to a transfer function increases the recipient's balance requires a more nuanced check than we can perform with an inductive invariant.

Trace-level properties are what give us the fine grained detail to look at individual operations and how they transition a contract to different states. Using this approach we look at our contracts as state transition systems with predefined states (before and after the transition) along with specific transitions between them that we define using contract function calls and check against certain expectations regarding how they should modify the state (via assertions). In Certora we define these properties using the rule keyword.

Evaluating Specifications

In order to verify a specification for a contract, the Prover directly compares the contract bytecode with the specification file which allows it to verify the code without trusting that the compiler works correctly, and instead verifying the code that’s executed by the EVM directly. The bytecode and properties are translated into mathematical formulas that define the exact conditions under which the program can violate properties defined in the specification. The formulas are then checked using SMT solvers which generate a counterexample if any exist.

Certora uses constraint solving and symbolic execution to analyze smart contracts by executing functions using symbolic values (e.g. x > 10) instead of concrete values (e.g. x = 10). Constraint solving converts execution paths to mathematical formulas with these symbolic values as inputs. The Certora Prover then uses an SMT solver to determine if the path is satisfiable (i.e. there’s a value that satisfies the logical formula which is a constraint generated from the specification file). If such a path exists this is returned to us by the Prover as a counterexample which we can use to see for exactly what values the logical formula is satisfied. This method is beneficial because it allows us to find these values without having to evaluate each individual value (which would be computationally expensive and time consuming) while still allowing us to determine if values within the symbolic range satisfy the formula.

In normal contract testing we pass in sample expected inputs to a contract’s functions for a given state that we expect the contract to be in and the test determines if the resulting behavior holds up to our expectations. This however leaves out the entire set of possible paths (for given inputs that aren't included) and states for which we haven’t created tests which could potentially lead to gaps invisible to the test suite. If however, we’ve defined a formal specification that precisely describes expected contract behavior correctly we can prove mathematically that our contract behaves as expected for all possible input values. This is a much more robust assurance that our contract will behave as intended.

Using specifications written in CVL to verify our contracts acts as an enhanced test suite, allowing us to definitively prove properties on a recurring basis so that even when the contract code is modified via upgrades or refactorings we can check them against existing specifications to ensure no properties are violated.

The Certora Prover

When working with Certora our primary objective is to formalize properties about our smart contract system using the CVL which when fed into the Prover will take care of ensuring that these properties hold for all possible cases that the system could reach. While this may seem like a simple task (that fortunately is made significantly easier by the CVL) we still must ensure that we are creating specifications that in fact allow us to verify all possible edge cases and actually tell us something meaningful about the program being verified.

Invariants (specified using the invariant keyword) in CVL allow us to guarantee that certain properties hold before and after all method calls in a contract. Invariants have the added benefit of being the only method in the CVL that allows us to check properties before and after calls to a constructor for a contract which can't be done with rules.

In addition to invariants we can use the rule keyword in CVL to define specific state transitions that we expect for certain function calls, values returned from function calls and whether or not a function reverted. These can be simpler to write and understand than invariants so in this post our focus will be on their application to creating specifications. We can write invariants as parametric rules but doing so often requires more lines of code and is not checked after a call to the constructor function and so for brevity and reusability we tend to prefer to do so using the invariant keyword.

Rules

The general structures of rules will tend to be something along the lines of:

rule exampleOfARule(uint var) {

// preconditions can be added here to reduce possible branches
// ex: e.msg.sender != 0 filters out cases where the function is called by the 0 address  

// describes the state of the contract before the transition
uint pre_transition_state = getStateFromContract();

// describes the function called in the contract that causes the transition to the next state
stateTransitionFunction(var);

// describes the state of the contract after the transition
uint post_transition_state = getStateFromContract();

// logical statement to verify that the state is as it's expected to be
assert post_transition_state != pre_transition_state
}

In the above example we capture the state of one of the contract's variables before invoking a transition using the pre_transition_state variable. We then invoke a change of state on the contract by calling a function on it that modifies state ( stateTransitionFunction ), we can capture the new state of the contract variable using the post_transition_state variable. We then assert how the state after the transition should be different from the state before the transition.

When we feed this into the prover it symbolically simulates state transitions for all possible values for our undefined input variable var. The post_transition_state is then checked against our logical assertion and if there are values that make our assertion fail the Prover presents these in a counterexample, otherwise we can be sure that for all possible input values of var that our statement holds.

Example: ERC4626 Vault

The example above laid out the basic structure and objectives of rules but their power is best shown through concrete examples on an actual smart contract. For this we will use a contract with an interface that’s familiar to most Solidity devs, the ERC4626 Vault Contract from the OpenZeppelin library. ERC4626 provides a standardized API for tokenized vaults with shares of a vault represented by an ERC20 token; in essence it allows a user to deposit an asset token and receive a share token that represents their corresponding share of the vault proportional to their deposit. The example discussed here is adapted from a recent Certora workshop held at DevConnect Istanbul, the modified example referenced throughout this post is available in the repo here and can be used to follow along. If you’re following along and you don’t already have Certora installed, first you’ll need to install the Prover using the steps provided here.

First we focus on creating a basic specification to prove that the core functionality works, then we’ll see if we can use the Prover to identify the well-known share inflation attack that ERC4626 vaults are susceptible to.

Before creating the actual specification for our vault it's useful to define the properties that we’ll be testing to serve as a guide for writing our specification. There is an existing github repo that provides us with properties for ERC4626 vaults which we can use for defining the behavior of our vault. In this example we'll only be focusing on two properties to demonstrate core functionality and illustrate the share inflation attack but if we were fully verifying the vault contract we'd want to cover all the properties outlined in the previously mentioned repo.

We'll start off using the following properties provided in the repo's README:

Deposit/Mint by user must increase totalSupply
Withdraw/Redeem by user must decrease totalSupply

Let's break down the specification file (Vault.spec) to see how we go about verifying these properties. First, we can explicitly define the contract methods which we'll be calling on the contract in the methods block:

using ERC20Mock as asset;

methods {
    function totalSupply() external returns uint256 envfree;
    function redeem(uint256, address, address) external returns (uint256);
    function mint(uint256, address) external returns (uint256);
}

while not strictly necessary because the Prover can still call contract methods not defined in this block it helps us document the expected interface of a contract. In the above we've only defined the interface for the functions that we'll be using in the rules we're going to write. For methods that use some global Solidity variables (such as msg.sender or block.timestamp) CVL groups these global variables into an object of type env which must be explicitly passed as the first argument into the function to make these global variables accessible. Functions default to expect to have an env variable passed into them and will only not require it if they are declared envfree which can only be done in the function interface definition in the methods block.

The line above the methods block using ERC20Mock as asset allows us to access methods from an ERC20 contract that we've added to our "scene". Since the asset that users deposit into the vault are expected to implement this standard ERC20 interface, this allows us to call functions on the asset contract directly from within our specification.

After having declared the interface for the contract functions we'll be calling, we can write our first rule to prove the first property we outlined above (deposit/mint by user must increase totalSupply):

rule mintIncreasesTotalShareSupply(uint256 shares, address receiver) {
    env e;
    mathint total_supply_before = totalSupply();

    // call the mint function in the contract
    mint(e, shares, receiver);

    // totalSupply of shares after minting
    mathint total_supply_after = totalSupply();

    // minting increases the totalSupply of shares in the system
    assert total_supply_after == total_supply_before + shares, "total supply should increase after minting";
}

In this rule we perform a simple check to ensure that calling the mint function in the contract will increase the totalSupply of share tokens.

We can then write a rule to check our second property (withdraw/redeem by user must decrease totalSupply):

rule redeemDecreasesTotalShareSupply(uint256 shares, address receiver, address owner) {
    env e;
    require receiver != currentContract && owner != currentContract;
    
    mathint total_supply_before_redemption = totalSupply();
    // filters out unreachable states with shares greater than the totalSupply
    require total_supply_before_redemption >= to_mathint(shares);

    redeem(e, shares, receiver, owner);

    mathint total_supply_after_redemption = totalSupply();

    // assert that the totalSupply of shares should be decreased by redeemed amount of shares
    assert total_supply_after_redemption == total_supply_before_redemption - shares, "total supply of shares should be decreased by amount of redeemed shares";
}

This rule checks that redeeming shares properly decreases the totalSupply.

We can run these two rules as a job for the Prover using the following command from the ERC4626_Workshop directory:

certoraRun certora/conf/Vault.conf

Which runs the Prover with the following configuration file:

{
  "files": [
    "src/mocks/ERC4626Mock.sol",
    "src/mocks/ERC20Mock.sol"
  ],
  "link": ["ERC4626Mock:_asset=ERC20Mock"],
  "verify": "ERC4626Mock:certora/specs/Vault.spec",
  "send_only": true,
  "optimistic_loop": true,
  "rule_sanity": "basic",
  "msg": "Verification of ERC4626",
  "solc": "solc-0.8.20",
}

The configuration file simplifies the inline arguments we need to pass via the command line when we run the spec file and makes it easier to share these configurations with other collaborators. This submits the job to the Certora cloud service which actually does the computationally expensive part of verifying that our program is correct (or creating a counterexample if it isn't).

After evaluating this specification the prover outputs the following result:

From this we can be confident that our contract matches the specification and that the Prover could therefore not find any counterexamples for it. Note that this only means that our contract matches the specification, which implies that if our contract specification is incorrect, there are still potential edge cases that could break the properties that we've outlined. We will see the results of this in the next section and why comprehensive specifications are important.

Price Inflation Attack

With the two rules above we can see that we've proven the basic functionality of our vault: when a user calls mint they increase the totalSupply of shares of the vault in proportion to the amount of assets they've deposited at a given share price. When they call redeem they decrease the totalSupply and receive assets proportional to the number of shares they redeemed with the price of shares being calculated as (assets*totalSupply)/totalAssets as we can see from the _convertToShares function:

function _convertToShares(uint256 assets, Math.Rounding rounding) internal view virtual returns (uint256) {
        uint256 supply = totalSupply();

        return 
              (assets == 0 || supply == 0)  ?  _initialConvertToShares(assets, rounding) :  assets.mulDiv(supply, totalAssets(), rounding);
}

This leads us into the share price inflation attack that we’ll look at next. Working through possible execution paths of this function we can see that the price calculation for the initial depositor uses the _initialConvertToShares function which just returns the assets deposited as the number of shares that the depositor receives:

function _initialConvertToShares(uint256 assets,Math.Rounding /*rounding*/) internal view virtual returns (uint256 shares) {
        return assets;
}

An attacker could therefore exploit this logic by being the first to deposit a small amount (say 1 wei) as the initial deposit for which they receive the corresponding shares. They then can make a “donation” of more assets to the vault which causes the numerator in the _convertToShares function to be very large without a proportional increase in the totalSupply denominator, if the number of donated assets is sufficiently high, subsequent depositor’s share price calculation is rounded down to 0 and they therefore have lost their funds to the vault without any way to recover them. The attacker, as the only shareholder, can therefore redeem their overvalued single share by calling the redeem function and receive the entire asset balance of the vault:

function redeem(uint256 shares, address receiver, address owner) public virtual override returns (uint256) {
        require(shares <= maxRedeem(owner), "ERC4626: redeem more than max");

        uint256 assets = previewRedeem(shares);
        _withdraw(_msgSender(), receiver, owner, assets, shares);

        return assets;
    }

The above function eventually calls the _convertToAssets function within previewRedeem to calculate the amount of assets received:

function _convertToAssets(uint256 shares, Math.Rounding rounding) internal view virtual returns (uint256) {
        uint256 supply = totalSupply();

        return
            (supply == 0) ? _initialConvertToAssets(shares, rounding) : shares.mulDiv(totalAssets(), supply, rounding);
}

this results in the calculation of share price as (shares*totalSupply)/totalAssets, the attacker therefore makes off with the entire asset balance of the vault. We can see that this attack slipped through our initial specification because we had no check that asserts that shares > 0 so the totalSupply could end up not actually increasing after a deposit and make this attack possible.

With an understanding of the attack mechanism we can implement a rule that allows us to catch this type of attack. This rule uses the security property defined in the previously mentioned README:

Accounting system must not be vulnerable to share price inflation attacks

If we take this statement and analyze what it means from the perspective of state transitions of our system we can rephrase it as "it should not be possible for a user to receive more assets than they've deposited" as this would imply that the redemption price has been manipulated such that it's more favorable for a given user. We can convert this plaintext definition into a rule like so:

rule shareInflationAttack(uint256 assets, address deposit_receiver, address redeem_receiver, address redeem_owner) {
    env e;
    address attacker = e.msg.sender;

    require(balanceOf(attacker) == 0);
    require(balanceOf(deposit_receiver) == 0);
    require(balanceOf(redeem_receiver) == 0);
    require(balanceOf(redeem_owner) == 0);
    require(attacker != currentContract);

    uint256 shares = deposit(e, assets, deposit_receiver);

    // asset balance of the Vault contract for illustrative purposes
    uint256 vault_asset_balance = asset.balanceOf(currentContract);

    uint256 receivedAssets = redeem(e, shares, redeem_receiver, redeem_owner);

    assert receivedAssets <= assets, "The attacker gained assets.";
}

We run this with the following command:
certoraRun certora/conf/InflationAttack.conf

Which runs the following configuration file:

{
  "files": [
    "src/mocks/ERC20Mock.sol",
    "src/mocks/ERC4626Mock.sol"
  ],
  "verify": "ERC4626Mock:certora/specs/ERC4626-InflationAttack.spec",
  "link": ["ERC4626Mock:_asset=ERC20Mock"],
  "prover_args":[
    "-smt_hashingScheme plainInjectivity",
    "-solvers [yices,z3]"
  ],
  "server": "production",
  "solc": "solc-0.8.20",
  "rule_sanity": "basic",
  "send_only": true,
  "msg": "Share inflation attack on ERC4626",
}

When we look at the output of running this rule through the Prover we get the following counterexample:

We can see that after calling the redeem function, the user who initially deposited 1 of the vault asset receives the entire asset balance of the vault (0x8000000000000000000000000000000000000000000000000000000000000002). This provides us proof that our vault is susceptible to the share price inflation attack and we must therefore modify our contract logic to remove this vulnerability in order for the shareInflationAttack rule to pass.

This serves as a demonstration of how robust formal verification is, as we were able to prove that our contract can reach an unfavorable state which could lead to a loss of user funds without having to explicitly define the user actions that create the vulnerable state as we would with a unit test. We also similarly didn’t have to define a range of possible execution values that our functions may be called with as we would with fuzz testing. Yet we have still managed to uncover what is a major vulnerability by simply modeling possible user flows, subsequent details about states that may violate our conditions were completely handled by the Prover. It’s also worth noting that this bug could similarly have been caught by a rule that asserts that a user should always receive a number of shares greater than 0 after depositing, although preventing this could still leave other similar attack vectors open.

Ultimately our goal with writing specifications is to not have to explicitly define attack scenarios with them but instead create sufficiently general specifications that cover all possible user actions such that any potentially exploitative state transitions are covered. We should also be cognizant that formal verification is only as good as the specification that we write for it, so just because we think that our specifications provide coverage of all possible paths doesn't mean that this is the case.

The takeaway here is that formal verification is an extremely powerful tool to allow us to prove program correctness, however we must properly implement specifications in order to ensure they are truly verifying all possible states that the contract could reach and not only a limited subset that we have biased it into. A good rule of thumb is that we want to make our tests as general as possible, with as few require statements as possible, so that we can reach all of these possible states.

Many thanks to Armen and Uri from Certora for their feedback and support for this article.

Sources & Further Reading

Was it useful? Help us to improve!

With your feedback, we can improve the letter. Click on a link to vote:

An Introduction to Formal Verification With Certora