Please describe your proposed solution.
We have previously built Yogo, which uses automated reasoning to find buggy code in multiple languages. Yogo allows a user to write rules that allow Yogo to recognize many different forms of a high-level concept such as “user-specified value,” and then write queries based on these concepts such as “find any place where a user-specified value is used to choose a currency.” The upshot is that, from a short query, Yogo can find nearly every place in a codebase that semantically exhibits a certain class of buggy behavior, even when it bears no syntactic resemblance. For example, in our PLDI 2020 paper introducing Yogo, we used it to find a bug in Oracle’s Graal codebase (1.2 million lines of Java) which had been missed by a custom static-analyzer designed to find that exact type of bug.
With this grant, we will bring Yogo to Haskell and Plutus, and develop rules and queries for finding common bugs and vulnerabilities in smart contracts.
Yogo is built on the Cubix multi-language infrastructure, which currently supports C, Java, JavaScript, Lua, and Python. We will add Haskell support to Cubix, then add Haskell support to Yogo, and then write several bug-finders using Yogo.
Please describe how your proposed solution will address the Challenge that you have submitted it in.
Fund Statement:
> How can we create a positive developer experience that helps the developer focus on building successful apps?
Having vulnerabilities in apps and needing to buy weeks of auditor time to find them detracts from a positive experience.
Yogo will be able to discover common vulnerabilities and flaws in a fraction of the time of traditional audits.
What are the main risks that could prevent you from delivering the project successfully and please explain how you will mitigate each risk?
There are three phases to the project: adding Haskell support to Cubix, adding Haskell support to Yogo, and writing bug-finding static analyses using Yogo. Each carries its own technical risks. We give a brief overview of the risks and mitigations, and then a highly technical description .
LESS TECHNICAL DESCRIPTION
Haskell and Plutus are substantially different from the languages previously supported by Cubix and Yogo. While there is no concern that they cannot be extended to support Haskell and Plutus, there is a risk that it will be able to share much less work with the existing languages than expected. However, there is always the option of only providing partial support to a language, focusing on getting a large enough subset to cover many smart contracts.
There is also a risk that some classes of bugs will be very difficult to encode into Yogo queries.
HIGHLY TECHNICAL DESCRIPTION
The primary concerns when adding Haskell support to Cubix are its differences from other languages in scoping. We will likely need to implement a new generalize scoping system for Cubix so that it can capture the commonalities between variable declarations in Haskell and other languages while still remaining faithful to each.
The primary concern when adding Haskell support to Yogo are in the use of recursion instead of loops, the use of higher-order functions, and the simultaneity of variable declarations. These all require encodings different from any of Yogo’s prior supported languages. For recursion in particular: The key insight behind program expression graphs (PEGs), the representation used by Yogo, is to create a single “loop” node which represents infinitely many states of the program. It is expected that this insight will generalize to recursion, but not yet proven. It should at least be applicable to the most common subsets of recursion found in smart contracts (which should not be doing unbounded computation).
The primary concern with Plutus in particular is the reliance on exceptions in validators. Yogo currently has poor support for exceptions. However, Ross Tate in his thesis did explain how to elegantly handle exceptions in PEGs, the representation used by Yogo.
For building bug finders: Yogo is very good at finding things which are present, but less good at finding things which are absent. It can still be done, but it is much clumsier and slows development. However, many bug-finders rely on showing that things are absent.