Debugging Smart Contracts with SmartACE
Fixing Smart Contracts with SmartACE

Debugging Smart Contracts with SmartACE

By Scott Wesley in collaboration Maria Christakis, Arie Gurfinkel, Xinwen Hu, Jorge Navas, Richard Trefler, and Valentin Wüstholz.

Solidity supports pre- and post-conditions through the require and assert statements. As shown in the previous tutorial, these are sufficient for simple representation invariants. However, for more complex properties, especially between two or more contracts, manual instrumentation becomes difficult and error-prone.

In this tutorial, we show how smart contract invariants can be expressed in temporal logic, converted into monitors, and then instrumented mechanically. This instrumentation will be automated in future versions of SmartACE.

Note: This tutorial assumes all commands are run from within the SmartAce container. All tutorial files are available within the container from the home directory.

Invariants Across Multiple Contracts

The following Solidity program gives a Fund contract which can open() and close(). While open, users can deposit() Ether into the fund. The Manager smart contract constructs a fund for which it is the owner. The fund should open once openFund() is called on Manager.

contract Fund {
    bool isOpen;
    address owner;

    constructor() public { owner = msg.sender; }

    function claim() public {
        owner = msg.sender;
    }

    function open() public {
        require(msg.sender == owner);
        isOpen = true;
    }

    function close() public {
        require(msg.sender == owner);
        isOpen = false;
    }

    function deposit() public payable { require(isOpen); }
}

contract Manager {
    Fund fund;

    constructor() public { fund = new Fund(); }

    function openFund() public { fund.open(); }
}

At first glance it may appear that the balance of fund will not change unless openFund() is called. This is because calls to open() are checked against the current owner. However, ownership exploits are an issue in real smart contracts. Let’s use SmartACE to see if this invariant really holds.

Encoding the Property

The first step in encoding our property is to state it precisely. Our logic of choice is past linear temporal logic (pLTL). We start with an informal statement of the property.

It is always the case that if openFund() is not called even once, then the balance of Manager.fund prior to and after the last transaction remains unchanged.

We can formalize this statement in the VerX Specification Language for Ethereum smart contracts. In this language, FUNCTION is the name of the last method called by a client, while prev(v) is the value of v before FUNCTION was called. once and always pLTL operators, where once(p) is true if p has ever been true while always(p) is true if p has always been true. Using this language, our property becomes:

always(
    !(once(FUNCTION == Manager.openFund()))
    ==>
    (BALANCE(Fund) == prev(BALANCE(Fund)))
)

To detect property violations, we convert the property into a monitor. For brevity, we introduce the following predicates:

The regular expression for the monitor is (unchanged && !called)* called True*. The corresponding automaton is as follows:

The next two sections explain the details of the model, and then walk through instrumenting the monitor. This procedure is currently manual, but will be automated in a future release of SmartACE. The readers less interested in these details can safely skip to the final section.

Understanding the Model

Before we instrument the SmartACE model, we take a detour to understand its structure. Readers only interested in the monitor can skip ahead. Otherwise, start by saving fund.sol and then running:

solc fund.sol --bundle=Manager --c-model --output-dir=fund

This generates several artifacts:

To understand the model, we will focus on cmodel.c. This file is unformatted, but we can improve its presentation with clang-format:

We highlight four parts of the model: the contract encoding, the method encoding, the bundle construction, and the “transaction loop”. For those interested in the model, but do not want to run the tools, the full generated model is available here. The line numbers below refer to that model. We start at line 8, with the contract encoding.

struct Fund {
    sol_address_t model_address;
    sol_uint256_t model_balance;
    sol_bool_t user_isOpen;
    sol_address_t user_owner;
};

struct Manager {
    sol_address_t model_address;
    sol_uint256_t model_balance;
    struct Fund user_fund;
};

A c-struct is generated for each contract. All Solidity primitive types are wrapped by singleton c-struts in primitives.h. Members prefixed by user_ are taken from the smart contract definition, while members prefixed by model_ are generated by SmartACE. If a contract state variable is set exactly once, through a call to new, the new contract is nested inside the calling contract. To illustrate, if we defined a contract struct Manager contract_0, then the balance of fund is contract_0.user_fund.model_address.

Between lines 19 to 84, we then find the encoding of each method. To avoid name collisions, SmartACE renames each method according to the following rules:

We then move on to the entry-point, void run_model(void), at line 85. After some setup code, we find at line 102 the initialization of Manager, and in turn, Manager.fund.

smartace_log("[Initializing contract_0 and children]");
sender.v = nd_range(3, 5, "sender");
value.v = 0;
blocknum.v = nd_increase(blocknum.v, 0, "blocknum");
timestamp.v = nd_increase(timestamp.v, 0, "timestamp");
Init_Manager(&contract_0, sender, value, blocknum, timestamp, paid, origin);

To summarize briefly, each method prefixed by nd_ selects a value non-deterministically. These values are used to model an arbitrary msg.sender, and an increasing block.number and block.timestamp. The addresses 0, 1 and 2 are reserved for address(0), address(Manager), and address(Manager.fund), respectively. Hence, they are not valid senders. However, an attentive reader may notice that we restrict msg.sender to 3 and 4. The importance of this choice, and its impact on correctness, are the topic of the next tutorial. For now we can think of this as a bounded model with two clients, so at the least, our verification gives valid counterexamples.

At line 108 we then find the “transaction loop” which simulates a sequence of transitions:

smartace_log("[Entering transaction loop]");
while (sol_continue()) {
    sol_on_transaction();
    /* ... Select non-deterministic blockchain state ... */
    uint8_t next_call = nd_range(0, 5, "next_call");
    switch (next_call) {
    case 0: {
        smartace_log("[Calling openFund on contract_0]");
        Manager_Method_openFund(&(contract_0), /* Blockchain state */);
        smartace_log("[Call successful]");
        break;
    }
    /* ... Other methods ... */
    }
}

The methods sol_continue() and sol_on_transaction() are runtime hooks provided by libverify/. For Seahorn, sol_continue() is alway true while sol_on_transaction is a no-op. This means that the loop runs forever. On each iteration, a method is selected non-deterministically, and then executed with non-deterministic arguments. The simulation ends if an assertion fails.

Instrumenting the Model

In this section, we will walk through encoding the monitor using three ghost variables and a single assertion. The end result is avialable here. We start by declaring each variable at the beginning of cmodel.c.

GHOST_VAR int g_called_openFund;
GHOST_VAR sol_uint256_t g_pre_balance;
GHOST_VAR sol_uint256_t g_post_balance;

g_called_openFund tracks when Fund.openFund() is called. g_pre_Fund_balance and g_post_Fund_balance store the values of BALANCE(FUND) before and after each transaction. Note that GHOST_VAR is a placeholder for readability, and is ignored by the C compiler.

We return to Manager_Method_openFund on line 77, and set g_called_openFund on entry. This is sufficient, as there is a single instance of Manager.

void Manager_Method_openFund(struct Manager *self, /* Blockchain State */) {
    g_called_openFund = 1;
    Fund_Method_open(&self->user_fund, /* Blockchain State */);
}

We then return to the transaction loop at line 108. We capture the value of contract_1->model_balance before and after each transaction. At the end of each iteration, we assert that unchanged is true, or that called has been observed at least once.

smartace_log("[Entering transaction loop]");
while (sol_continue()) {
    sol_on_transaction();
    g_pre_balance = contract_1->model_balance;
    /* ... Select non-deterministic blockchain state ... */
    uint8_t next_call = nd_range(0, 5, "next_call");
    switch (next_call) {
    case 0: {
        smartace_log("[Calling openFund on contract_0]");
        Manager_Method_openFund(&(contract_0), /* Blockchain state */);
        smartace_log("[Call successful]");
        break;
    }
    /* ... Other methods ... */
    }
    g_post_balance = contract_1->model_balance;
    sol_assert(g_called_openFund || g_pre_balance.v == g_post_balance.v, "Failure.");
}

Debugging the Contract

If we run cmake --build . --target verify on the instrumented contract, we will see that the property does not hold. When Seahorn detects a violation, it generates a counterexample. This counterexample resolves all non-determinism with concrete values, and drives the program to an assertion failure. Thankfully, Seahorn provides these counterexamples as LLVM programs. We can link a counterexample against our executable model to produce a debuggable trace (a witness).

In practice, we could analyze this trace using a debugger such as gdb. However, in this example, it is sufficient to read a trace log from the witness. Let’s reconfigure our example with logging, and then build the witness:

This gives the trace:

[Constructing contract_0 and children]
sender [uint8]: 3
blocknum [uint256]: 0
timestamp [uint256]: 0
[Entering transaction loop]
sender [uint8]: 3
blocknum [uint256]: 0
timestamp [uint256]: 0
next_call [uint8]: 1
[Calling claim() on contract_1]
[Call successful]
sender [uint8]: 3
blocknum [uint256]: 0
timestamp [uint256]: 0
next_call [uint8]: 2
[Calling open() on contract_1]
[Call successful]
sender [uint8]: 3
blocknum [uint256]: 0
timestamp [uint256]: 0
next_call [uint8]: 4
[Calling deposit() on contract_1]
value [uint256]: 1
[Call successful]
[sea] __VERIFIER_error was executed

At first this counterexample might seem surprising. No client should be able to call open() directly. However, this is precisely what happens. To figure out why, we can log the owner of contract_1 during each iteration of the transaction loop. After making these changes we can rerun:

To obtain the following trace:

[Constructing contract_0 and children]
sender [uint8]: 3
[Entering transaction loop]
=> owner of contract_1: 1
sender [uint8]: 3
next_call [uint8]: 1
[Calling claim() on contract_1]
[Call successful]
=> owner of contract_1: 3
sender [uint8]: 3
next_call [uint8]: 2
[Calling open() on contract_1]
[Call successful]
=> owner of contract_1: 3
sender [uint8]: 3
next_call [uint8]: 4
[Calling deposit() on contract_1]
value [uint256]: 1
[Call successful]
[sea] __VERIFIER_error was executed

We can now see how this is possible. When Manager is constructed, it creates a new fund, whose owner is Manager. This aligns with the first owner. Client 3 then claim() and takes ownership. This aligns with the second owner. From here, client 3 is free to open() the contract, and then deposit() a single Ether. The balance of Fund has changed without a call to openFund(). This suggests that all ownership transfers should be approved by the current owner. One solution is to remove claim() and replace it with an owner-only transfer method.

Conclusion

In this tutorial, we learned how to instrument SmartACE models using pLTL specifications. We then saw how this technique could be used to find non-trivial vulnerabilities in a bounded model of the system. Finally, we learned how to use executable counterexamples to debug these vulnerabilities. In the next tutorial we will extend these techniques to unbounded models.

*****
Written by SeaHorn on 08 June 2020