Catch Tricky Bugs in Less Time Using Formal Verification
How to leverage formal verification to get better results from audit competitions
Author:
Armen Ter-AvetisyanIn the ever-evolving landscape of smart contract development, securing code is paramount. Yet, measuring security enhancements or the effectiveness of security services remains an elusive challenge. Formal Verification (FV) contests offer a measurable and effective approach to increasing code security by combining the depth of formal verification with the breadth of the community.
The primary goal of any security service is straightforward: enhance the security of smart contracts. However, the complexity lies in quantifying this security. A traditional security review might not identify any issues, which could lead to a mistaken sense of security. This sense of security hinges on the belief that auditors always deliver consistent results, which overlooks the human elements of variability and error. What is needed is a quantitative way to measure a given security service's effectiveness. While this is difficult for manual and human services like security reviews, it is straightforward for programmatic methods like testing, fuzzing, and formal verification.
One key aspect of quantifying the effectiveness of security services like formal verification is the measurement of coverage. Coverage is a well-known metric commonly used to evaluate the effectiveness of tests and is a crucial component in quantifying the impact of security services. In the context of smart contract security, coverage helps determine the extent to which the codebase has been thoroughly tested and verified, providing insight into the level of confidence one can have in the contract's security.
Traditional methods typically focus on whether a test touches a certain part of the code, which doesn't necessarily guarantee that the test would catch a bug if one existed. To address this shortcoming, we employ alternative methods to measure coverage more effectively. Mutation testing is the primary method utilized to evaluate formal specification, which directly addresses the question, "If a bug existed here, would we catch it?". Automatic methods introduce small changes (mutants) into the code using open-source tools such as Gambit. Mutation testing can be taken a step further by manually adding complex bugs, for example, new functions or larger changes to the code, which more robustly test specifications. By doing so, we get a better idea of coverage and stronger assurances. This method of mutation testing is particularly useful when paired with testing methods that benefit from taking a high-level or abstract view of the code, such as invariant fuzzing and formal verification.
While fuzzing and formal verification have distinct differences, the most critical difference lies in their approach. Fuzzing starts with a blank slate and incrementally builds up a generic environment through various function calls. In contrast, formal verification begins with an arbitrary state and focuses on refining it to realistic scenarios. This difference makes formal verification more compatible with advanced mutation testing. The more generic approach taken by formal verification allows for greater generalization, meaning one property is more likely to catch complex mutants or multiple similar mutants. For example, a property that checks for the conservation of token supply could potentially catch multiple mutants that attempt to manipulate the supply in different ways or a mutant that adds a new way of changing the supply. This ability to catch a wider range of bugs with a single property leads to more meaningful testing, as catching the mutant means that other bugs similar to it would also be caught.
Formal verification also enables various additional methods of coverage measurement. One particularly interesting method, currently in development, is the ability to check which parts of the code are necessary to prove properties, providing a more in-depth view of coverage. This feature may be utilized in future contests.
While manual code reviews have their place in the smart contract security ecosystem, a sustained effort in formal verification can be more impactful in the long run. This is due to two key factors: the ability to measure progress and the continuous engagement of security researchers through the need for explicit statements of all assumptions.
One of the most significant advantages of formal verification is the ability to quantitatively measure progress through coverage metrics. As the adage goes, "If you can't measure it, you can't improve it." Formal verification coverage is one of the best measurements of security, as it directly assesses the effectiveness of the specification in catching potential bugs. This quantifiable aspect allows teams to track their progress over time, identify areas that need more attention, and ultimately achieve a higher level of confidence in the security of their smart contracts.
Moreover, formal verification keeps security researchers engaged by forcing all implicit assumptions to be explicitly stated. In manual code reviews, researchers often rely on their intuition and experience to identify potential issues. However, this approach can lead to overlooked vulnerabilities if certain assumptions are not explicitly considered. Formal verification, on the other hand, requires researchers to specify their assumptions. If any assumptions are left implicit, counterexamples will arise during the verification process, prompting researchers to refine their specifications.
This explicit stating of assumptions also forces thoroughness in the verification process. For example, by considering interesting inputs like the sender being a contract or not assuming relationships between variables, researchers are forced to check these parts of the code more thoroughly. This increased scrutiny makes them more likely to catch bugs that might otherwise go unnoticed in a manual review.
In summary, a long-term formal verification effort can be very impactful by utilizing coverage metrics and continuously engaging security researchers. By leveraging these advantages, teams can achieve a higher level of security in their smart contracts and maintain that security over time. However, our goal is not to secure code in months; we need to do it in weeks. This is where the power of the community and the efficiency of formal verification contests come into play.
Formal verification contests provide a powerful way to compress the time required for thorough code verification without compromising the quality of the results. By harnessing the collective knowledge and creativity of the community, these contests enable rapid improvements in code security through measurable means.
A typical contest begins with a basic setup, including the configuration for the Prover tool and a few example properties. Participants work on verifying the code in their private repositories, which they share with the judges at the end of the competition. At the end of the contest, unknown mutations, representing potential bugs, are introduced into the codebase. These mutations serve as a way to evaluate the effectiveness of the participants' verification efforts, with the most difficult-to-find bugs being valued the most. Once all evaluations are completed, the best specification can be used to set up CI/CD for continuous verification of the strongest properties.
The contest format encourages participants to think creatively and approach the code from different angles. By forcing participants to explicitly state and verify their assumptions and consider a wide range of potential vulnerabilities, formal verification contests help uncover rare bugs and increase confidence in code security.
To ensure the highest quality results, top-performing participants, known as FV Heroes, are called upon at the end of the contest to verify that all mutants were caught by strong properties. This extra stage adds a layer of scrutiny and helps maintain the integrity of the verification process. There will be more details about the FV Hero program in upcoming articles.
High code coverage achieved through formal verification, measured with mutation testing and other advanced methods, is one of the leading ways to quantify and improve code security. By providing measurable increases in code security in a fraction of the time, FV contests are a strategic addition to protocols’ smart contract security toolkit. This approach aligns with the growing trend of crowd-sourced solutions, tapping into the collective intelligence and efficiency of the community to achieve what was once thought only possible through long, solitary endeavors. FV contests are about more than just finding bugs faster; they are about instilling robust confidence in code security.