In today’s modern world, bugs in software systems incur significant costs. One promising approach to improve software quality is to prove that the software is correct. There has been tremendous progress in the past 20 years on techniques for performing proofs about programs, specifically through a research area called program verification. However, despite these advances, even state-of-the-art automated techniques have limitations, and there are cases where fully automated verification fails, thereby requiring some form of human intervention for the verification to complete. As a result, proofs of correctness about programs often still require some amount of manual effort, whether it’s in the form of annotations, pre-post conditions, loop invariants, or other guidance.
The broad goal of this project is to crowdsource proofs about programs using gamification, the way Foldit crowdsourced protein folding. In other words, we want to turn each proof about a program into a level inside a game, and let people play the game.
Let’s be honest: our long-term goal is a daunting and ambitious one. And so to get things going we started by looking at a narrower research question: can we turn proofs into a game? Not necessarily proofs about programs, but just any kind of proof. This exploration lead to our first result in this area: Polymorphic Blocks. You can find out more about Polymorphic Blocks by clicking on the link (which also includes our CHI 2015 paper on the topic), but the summary for our purpose here: we developed a visual puzzle game in which solving a puzzle is equivalent to finding a proof in natural deduction. Players solved visual puzzles without ever being exposed to the mathematical symbols.
Although players of our game were able to build simple natural deduction proofs, the real question for us was: how can we move towards our ultimate goal, which is to reason about programs using gamification? If we want to move beyond simple proofs, one of the main challenges is that formulas get larger, and it’s not clear that our Polymorphic Blocks visual representation of formulas scales. Also, in Polymorphic Blocks, users have to do every single step of the proof, which gets really tedious. But there are automated tools that can do many of the tedious steps, so we might want to use those tools to help players. So we decided to try a new approach.
Our new approach is rooted in two decisions that put us at a different point in the design space than our previous game. First, we decided to expose the symbols to the user. It turns out, people have a lot of intuition of about numbers, addition, subtraction, and exposing these symbols will allow us to leverage this. Second, instead of focusing on mundane proof details, we decided to focus specifically on coming up with loop invariants, which is one of the hardest parts of program verification.
In particular, we developed a new puzzle game called InvGame where players find loop invariants without being aware of it. Our game displays rows of numbers, and players must find patterns in these numbers. Without players knowing it, the rows in the game correspond to values of variables at each iteration of a loop in a program. Therefore, unbeknown to the player, finding patterns in these numbers corresponds to finding candidate loop invariants. As a result, by simply playing the game, players provide us with candidate loop invariants, without being aware of it, and without program verification expertise. InvGame then checks these candidates with a solver to determine which are indeed loop invariants. We have shown show through an experiment with Mechanical Turk users that players enjoy the game, and are able to solve verification tasks that automated state-of-the-art tools cannot. More details can be found in our CHI 2018 paper below.
Polymorphic Blocks: Formalism-Inspired UI for Structured Connectors (CHI 2015)
with Stephen R. Foster and William G. Griswold
Inferring Loop Invariants through Gamification (CHI 2018)
with Dimitar Bounov, Anthony DeRossi, Massimiliano Menarini, William G. Griswold