Let’s be clear: abstraction is not about indirection, nor is it the process of giving things names, nor is it even the packaging of code into reusable modules. Informally, abstraction is merely the *elimination of detail*.

When we discuss the semantics of a particular program, we are really discussing what is called the semantic *model* of a program. A model is a mathematical object that captures the interesting aspects of what the program *means*. While there are a variety of models for various domains, languages, and objectives, a common choice for a semantic model of an imperative program is some kind of *state transformer*, which describes the *transitions* possible from a given initial state to some final state(s).

To eliminate detail on a model such as this requires a way to be imprecise about some aspects of a program — usually, this takes the form of non-determinism. Non-determinism can be hard for beginners to grasp, but it typically has to be employed when modelling real programs. For example, suppose we had a greeting program that differed depending on the physical location of the computer^{2}:

If we wanted to mathematically model the behaviour of this program, it would be frightfully inconvenient to include the geography of Earth and the computer’s physical location in our model. That’s where non-determinism comes in. If we *abstract* away from the geographical details, and instead regard the program as choosing between the two options based on some *unspecified criteria*, then we can get away with modelling less, at the cost of some detail:

Such underspecified conditionals are usually called *non-deterministic choice*, where is written simply as .

While we tend to view our physical machines as deterministic automata, the state upon which each decision is deterministically made includes a number of external things which are tedious to model mathematically. We can also use non-determinism to ignore details that we don’t care about for our particular domain — a common example is memory allocation, where it is very convenient (for some programs) to think of memory as infinite, and allocation as an operation that can potentially fail, without specifying exactly when and how this failure can occur. This is normally modelled as a non-deterministic choice between successful allocation and a failure state.

In a total, deterministic setting, we might model the semantics of a program as a total function: given an initial state, there will be exactly one final state determined entirely by the initial state. But, with non-determinism, each use of the choice operator potentially doubles the number of final states^{3}. So, with non-determinism in our language, the semantics of a program are given as a binary *relation* on states: a mapping from initial states to *every possible* final state. For our purposes, we will define a *state* as just a mapping from variable names to their values. We shall call the set of all states .

Before we go any further, let’s define a little language that we can use for our programs. For simplicity, we will assume that all our variables contain integers. First I’ll introduce the syntax, and then I’ll discuss the semantics of each form separately.

Here we use and to denote simple boolean propositions and arithmetic expressions respectively. These expressions may mention our program variables, so we will assume the existence of a simple semantics for them. For the arithmetic expressions, they are interpreted as a function that, given a state, will produce a resultant integer:

For boolean propositions, the semantics are simply the set of states where the proposition holds:

As mentioned in the previous section, the semantics of a given program will be a binary *relation* on states:

For an assignment statement, the final state is the same as the initial state, save that the updated variable is replaced with the result of evaluating the expression with respect to the initial state:

For non-deterministic choice, seeing as the semantics of each branch contains all the possible state transitions of that branch, the semantics of the choice is just their union:

We also have a familiar *sequential composition* operator, written as a semicolon, which behaves much like the semicolon in C and Pascal: the first program is executed, and then the second. Formally, this means that a transition can only be made through the composition if there exists an intermediate state, resulting from the first program, that leads to the final state via the second:

Where is an operator for forward-composition of relations, defined as:
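Spelled out in set-builder notation, and assuming we write $R \mathbin{;} S$ for the forward composition of two relations $R$ and $S$ (the symbol is a choice made for this sketch), the usual definition is:

```latex
R \mathbin{;} S \;=\; \{\, (\sigma, \sigma'') \mid \exists \sigma'.\ (\sigma, \sigma') \in R \,\land\, (\sigma', \sigma'') \in S \,\}
```

The semantics of a sequential composition is then the forward composition of the semantics of its two parts.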

We also have *guards*, which are programs that do not change the state, but only permit execution when the given boolean condition holds:

We can use the above building blocks to regain the familiar `if` statement:
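To make the constructs so far concrete, here is a minimal Haskell sketch of an interpreter for them. The constructor names (`Assign`, `Choice`, `Seq`, `Guard`), the function `run` and the helper `ifThenElse` are all assumptions made for illustration, and a relation is represented by listing every final state reachable from one initial state.

```haskell
import qualified Data.Map as Map
import           Data.List (nub)

-- A state maps variable names to integer values.
type State = Map.Map String Integer

-- Hypothetical abstract syntax for the toy language (names are ours).
data Prog
  = Assign String (State -> Integer)  -- x := e
  | Choice Prog Prog                  -- non-deterministic choice
  | Seq Prog Prog                     -- sequential composition
  | Guard (State -> Bool)             -- proceed only if the condition holds

-- All final states reachable from one initial state.
run :: Prog -> State -> [State]
run (Assign x e) s = [Map.insert x (e s) s]
run (Choice p q) s = nub (run p s ++ run q s)          -- union
run (Seq p q)    s = nub (concatMap (run q) (run p s)) -- relational composition
run (Guard g)    s = [s | g s]                         -- identity restricted to g

-- An if statement, encoded with choice and two complementary guards.
ifThenElse :: (State -> Bool) -> Prog -> Prog -> Prog
ifThenElse g p q = Choice (Seq (Guard g) p) (Seq (Guard (not . g)) q)
```

Running a `Choice` returns the final states of both branches, which is where the doubling of outcomes comes from; in the `ifThenElse` encoding, exactly one of the two guards lets execution through for any given state.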

**Exercise**: Devise a direct semantic definition for `if` statements. Prove that your semantics are equivalent to that of the translation into non-deterministic choice and guards.

Lastly, in any real programming language, we need some mechanism for loops or recursion. For our toy language, we add the very simple *Kleene star*, which runs its operand a non-deterministic number of times. A good intuition is to think of this recursive expansion^{4}:

Semantically, this is the *reflexive, transitive closure* of the semantics of P:

Where superscripting a relation is self-composition:

Here is the identity relation, i.e. .
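Written out, with $R^{n}$ for the $n$-fold self-composition of a relation $R$, $\mathrm{Id}$ for the identity relation, $;$ for forward composition and $\Sigma$ as an assumed name for the set of all states, the standard definitions are:

```latex
\begin{aligned}
\mathrm{Id} &= \{\, (\sigma, \sigma) \mid \sigma \in \Sigma \,\} \\
R^{0} &= \mathrm{Id} \\
R^{n+1} &= R \mathbin{;} R^{n} \\
R^{*} &= \bigcup_{n \in \mathbb{N}} R^{n}
\end{aligned}
```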

We can recover the traditional `while` loop using our Kleene star and some carefully placed guards: one in the loop body, to ensure the loop is only run while the guard is true; and one after the loop, to ensure that the loop only finishes when the guard is false:
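A hedged Haskell sketch of the star and of this loop encoding follows. The names `Rel`, `guard'`, `andThen`, `star` and `while` are assumptions made for illustration, and the closure computation assumes that only finitely many states are reachable from the initial state.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

type State = Map.Map String Integer

-- A relation on states, given by its image on one initial state.
type Rel = State -> Set.Set State

-- Identity restricted to the states satisfying the condition.
guard' :: (State -> Bool) -> Rel
guard' g s = if g s then Set.singleton s else Set.empty

-- Forward composition of relations.
andThen :: Rel -> Rel -> Rel
andThen r q s = Set.unions (map q (Set.toList (r s)))

-- Reflexive, transitive closure: keep stepping until no new states appear.
-- Assumption: finitely many states are reachable, otherwise this diverges.
star :: Rel -> Rel
star r s0 = go (Set.singleton s0) (Set.singleton s0)
  where
    go seen frontier
      | Set.null frontier = seen
      | otherwise =
          let next = Set.unions (map r (Set.toList frontier)) `Set.difference` seen
          in go (Set.union seen next) next

-- A loop: star of (guard; body), followed by the negated guard.
while :: (State -> Bool) -> Rel -> Rel
while g p = star (guard' g `andThen` p) `andThen` guard' (not . g)
```

The star on its own keeps every intermediate state as a possible outcome; it is the trailing negated guard that discards all states in which the loop condition still holds.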

**Exercise**: Devise a direct semantic definition for `while` loops. Prove that your semantics are equivalent to that of the translation into the Kleene star and guards.

When we transformed our simple greeting program into a nondeterministic choice, we reduced the size of our state model, but doubled the number of possible outcomes for a given initial state. Instead of being able to determine which greeting would be printed, we must now account for both greetings.

This means that the more *abstract* a program is, the *bigger* the semantic relation is. We can say that a program is an *abstraction* of a program iff:

Equivalently, we also say that is a *refinement* of . Refinement is the inverse of abstraction.
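In symbols, taking $\llbracket P \rrbracket$ as an assumed notation for the semantic relation of a program $P$, and $P$ and $Q$ as placeholder program names, the definition amounts to a superset check:

```latex
P \text{ is an abstraction of } Q \quad\iff\quad \llbracket Q \rrbracket \subseteq \llbracket P \rrbracket
```

For instance, a non-deterministic choice between two greetings is an abstraction of the program that always prints the first greeting, since the semantics of a choice is the union of the semantics of its branches.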

Because refinement is just the subset relation on semantics, it forms a bounded lattice, giving us a greatest and least element. The greatest element is the relation that contains all state transitions:

This greatest element is an abstraction of every program, because it is so non-specific that it contains every possible outcome the program could produce.

Conversely, the least element is the relation that does not contain any transitions, representable syntactically with the always-false guard, or the infinite loop:

One common use for abstraction in computer programming is for the *specification*, *verification* and *derivation* of programs.

If we define a *specification* of a program as a pair of a pre- and a post-condition, we could specify something like a factorial program as follows:

Here we are using *specification statements* of the form , where , the pre-condition, and , the post-condition, are referred to collectively as *assertions*.

The specification statement describes a program that, assuming that the pre-condition is true of the initial state, will ensure that the post-condition is true of the final state. Exactly *how* the program gets from the initial state to the final state is left unspecified. We can make these specification statements bona-fide statements in our toy language, and give them a semantics:

Our semantics for a specification statement include every possible transition that satisfies the specification. Therefore, our specification is an *abstraction* of every possible *implementation* of that specification.

A common technique for the derivation of programs is to build a syntactic *refinement calculus*, allowing us to incrementally refine a specification into less and less abstract versions, until we at last have a version suitable for implementation. This process proceeds via formally justified *refinement rules*. Because they are proven to be sound, a correct application of these rules from the specification yields a correct program by construction.

Let us define a miniature refinement calculus for use with our toy language. For a calculus that is actually useful for more real-world programming scenarios, I recommend consulting Carroll Morgan’s great book, *Programming from Specifications*, an online copy of which is available here.

To start with, we will define a syntactic abstraction relation, , which is defined like this^{5}:

Now, we can give rules for introducing each of our language constructs^{6}:

**Exercise**: By translating the above rules into semantics, show that the rules are sound (that is, that the semantics of the RHS is a subset of the semantics of the LHS).

We can also derive rules for our trusty `if` statements and `while` loops:

**Exercise**: Show that these rules are indeed derivable, using the translations provided in the previous section.

Lastly, it is also sometimes necessary to apply logical reasoning to transform assertions during the derivation process. The *consequence* rule, given below, allows us to swap out our assertions for more convenient ones, provided they remain a refinement of the original assertions:

Using our refinement calculus, let’s derive an implementation for our factorial specification:

First, we have to split the code into two parts: the first initialises variables and establishes the loop invariant, and the second actually contains the loop.

Next, we must use the consequence rule, to get the spec statement into the right form for using with the while loop rule. After introducing the loop, we can fill in the body a bit by incrementing the counter:

Here we must use the consequence rule in order to get the meat of the loop body into the right form for the assignment rule.

Lastly, we just initialise our variables in the obvious way to ensure the loop invariant holds initially:
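As a sanity check on the shape of such a derivation, here is a hedged Haskell sketch of an initialise-then-loop factorial that checks a loop invariant at run time. The variable names `i` and `m`, the invariant `m == i!` and the function names are assumptions made for illustration; the derivation in the text may use different ones.

```haskell
-- Reference specification: the post-condition we want, m == n!.
factSpec :: Integer -> Integer
factSpec n = product [1 .. n]

-- Loop invariant assumed for this sketch: m is the factorial of the counter i.
invariant :: Integer -> Integer -> Bool
invariant i m = m == factSpec i

-- i := 0; m := 1; while i /= n do (i := i + 1; m := m * i)
-- Returns Nothing if the invariant is ever violated. Requires n >= 0.
factLoop :: Integer -> Maybe Integer
factLoop n = go 0 1
  where
    go i m
      | not (invariant i m) = Nothing
      | i == n              = Just m   -- guard is false: the loop exits
      | otherwise           = go (i + 1) (m * (i + 1))
```

When the loop exits, the invariant together with the negated guard `i == n` gives exactly the post-condition, which is the same argument the loop rule packages up.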

Treating specifications as abstractions of their implementations is a powerful idea. It gives a semantic framework for the gradual, step-by-step derivation of a correct program from its correctness definition.

Moreover, it shows that a common informal definition of abstraction that is bandied about by programmers — the separation of a specification from an implementation — is just an instance of the more general notion of semantic abstraction. If we were to interpret types as a particularly weak form of specification, then we can view type systems as an instance of this technique as well^{7}.

One of the most common techniques for managing complexity in software engineering is that of *data abstraction*. Data abstraction is the process of *hiding* some particular piece of state behind an *interface* or *signature* of abstract *operations*. This allows for a neat separation of concerns. For example, consider this program that only succeeds if a string of parentheses and brackets is balanced:

This version makes use of an abstract *stack* type and four operations: , an initialiser which sets up an empty stack; , a simple predicate which is true iff the stack is empty; , the familiar operation that adds a new element to the top of the stack; and , the inverse of which removes the top element from the stack and returns it. Certainly, the version making use of abstract operations is far more readable than the concrete alternative, swapping the abstract stack for an (infinite-sized) array and an index to the top of the stack^{8}:
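To make the contrast concrete, here is a hedged Haskell sketch of both versions. It is a functional rendering rather than a program in the toy language, and the names (`initS`, `isEmpty`, `push`, `pop`, `balancedAbstract`, `balancedConcrete`) are assumptions made for illustration. The "infinite array" is modelled as a `Map` from indices, and `top` counts the elements currently on the stack.

```haskell
import qualified Data.Map as Map

-- Abstract version: the stack is used only through four operations.
newtype Stack = Stack [Char]

initS :: Stack
initS = Stack []

isEmpty :: Stack -> Bool
isEmpty (Stack xs) = null xs

push :: Char -> Stack -> Stack
push c (Stack xs) = Stack (c : xs)

pop :: Stack -> Maybe (Char, Stack)
pop (Stack [])       = Nothing
pop (Stack (x : xs)) = Just (x, Stack xs)

-- Does the second character close the first?
closes :: Char -> Char -> Bool
closes '(' ')' = True
closes '[' ']' = True
closes _   _   = False

balancedAbstract :: String -> Bool
balancedAbstract = go initS
  where
    go st [] = isEmpty st
    go st (c : cs)
      | c `elem` "([" = go (push c st) cs
      | c `elem` ")]" = case pop st of
          Just (o, st') | closes o c -> go st' cs
          _                          -> False
      | otherwise = False

-- Concrete version: an unbounded "array" plus the number of live elements.
balancedConcrete :: String -> Bool
balancedConcrete = go Map.empty 0
  where
    go :: Map.Map Int Char -> Int -> String -> Bool
    go _   top [] = top == 0
    go arr top (c : cs)
      | c `elem` "([" = go (Map.insert top c arr) (top + 1) cs
      | c `elem` ")]" = case Map.lookup (top - 1) arr of
          Just o | top > 0 && closes o c -> go arr (top - 1) cs
          _                              -> False
      | otherwise = False
```

In the concrete version a pop only decrements `top`; stale cells above it are left in place and overwritten by later pushes.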

Mathematically justifying the above translation is a process called *data refinement*, and a variety of techniques exist. One of the simplest is Reynolds’ method. Starting with the abstract program, it proceeds in four steps:

- Add variables to represent the *concrete* state (in this case, the array and the index of the top of the stack).
- Define a *coupling invariant*: an assertion that relates the abstract and the concrete variables. In our example, we assume a stack model given by a simple grammar, and define the coupling invariant relating the abstract stack to the array and index as a recursive predicate over that model.
- For each operation that *writes* to abstract variables, *add* code to perform the corresponding updates to the concrete variables, such that the coupling invariant is re-established. This step can be formally justified using a program logic such as Hoare logic, which is analogous to the refinement calculus used above, except designed for post-hoc verification rather than derivation of correct programs from specifications.
- Each operation that *reads* from abstract variables is *replaced* with code that reads the same information from the concrete variables. This step should be justified as a direct consequence of the coupling invariant.

With all abstract read operations replaced with concrete ones, the abstract write operations are now completely superfluous, and can be removed.

Following the above steps with our original stack-based program will yield the concrete program we devised in terms of arrays. So the method appears to work, but what do data abstraction and data refinement have to do with the notions of abstraction we saw in the previous section?

To be able to talk about data abstraction in terms of semantics, we need a semantic model of a data type. Formally, we consider a data type to consist of:

- A set of *representation variables*, containing the data of the data type. The state space is *extended* with these additional variables.
- An *initialiser* (or a *constructor* if you prefer), which augments the state with a new instance of our data type, introducing our representation variables.
- A *finaliser* (or a *destructor* if you prefer), which eliminates our representation variables from the state.
- For each *operation name*, a relation: simply the semantics of the operation.

Let’s define data types for our abstract stack and our concrete implementation. To make specification easier, we annotate the names of the operations with the external variables they may touch. More elaborate refinement calculi include *frames*, which make this technique a good deal more rigorous.

For our abstract stack, we never explicitly provide an implementation, merely providing specifications. Because, as we discussed before, specifications are in the same semantic domain as our programs, we can use them to provide our abstract data type.

For the concrete data type, we just take the semantics of the code we use to implement each operation.

With both data types, we can start to devise a definition of abstraction between data types.

Any consumer of our data type, such as the bracket-matching program above, can be viewed as the sequential composition of the initialiser, some sequence of operations, followed by the finaliser. A data type is a *refinement* of another if all such sequences are a refinement of the corresponding abstract sequence.

Thus, to show refinement, we must show that, for any operation sequence :

That is, data refinement is “just” program refinement, but for an *arbitrary* program. Next, we’ll look at common ways to prove this statement, and how they generalise syntactic approaches such as Reynolds’ method. For a more detailed introduction to this model-oriented version of data refinement, and comparisons to many more refinement techniques, I recommend this great book by W. P. de Roever and Kai Engelhardt (who was one of my teachers).

We would like to prove the above subset obligation using induction on the length of the sequence of operations, but the presence of the initialisers and finalisers makes the induction hypothesis useless, of the form , which does not refer to a subexpression of our goal.

One technique to resolve this is so-called *downward simulation*, where we define a *refinement relation* , and split the above obligation into three parts:

The initialiser establishes the refinement relation:

Each operation preserves the refinement relation:

Finalisers will converge from -related states:

The second part can be generalised into an analogous theorem about sequences, via a neat induction on the length of the sequence:

From here, one can straightforwardly use the first and third lemmas to show that is indeed a refinement of . In this way, we remove those pesky initialisers and finalisers so that we can do induction, and then just tack them on again after the induction is complete.

So, for our stack example, what would our refinement relation look like? It turns out to merely be a relational form of our coupling invariant from Reynolds’ method:

In fact, all of Reynolds’ method is just an instance of this downward simulation technique.
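A hedged Haskell sketch of such a relation for the stack example, together with matching abstract and concrete operations, might look as follows. The type and function names are assumptions made for illustration, and the concrete state is taken to be an unbounded array (a `Map` from indices) paired with the number of live elements.

```haskell
import qualified Data.Map as Map

type AbsStack  = [Char]                   -- abstract stack, top at the head
type ConcStack = (Map.Map Int Char, Int)  -- (array contents, number of live elements)

-- The refinement relation: reading the array downwards from index top - 1
-- spells out the abstract stack. Cells at or above top are unconstrained.
coupled :: AbsStack -> ConcStack -> Bool
coupled []       (_,   top) = top == 0
coupled (x : xs) (arr, top) =
  top > 0 && Map.lookup (top - 1) arr == Just x && coupled xs (arr, top - 1)

pushA :: Char -> AbsStack -> AbsStack
pushA = (:)

pushC :: Char -> ConcStack -> ConcStack
pushC c (arr, top) = (Map.insert top c arr, top + 1)

popA :: AbsStack -> Maybe (Char, AbsStack)
popA []       = Nothing
popA (x : xs) = Just (x, xs)

popC :: ConcStack -> Maybe (Char, ConcStack)
popC (arr, top)
  | top > 0   = fmap (\x -> (x, (arr, top - 1))) (Map.lookup (top - 1) arr)
  | otherwise = Nothing
```

The downward simulation obligations then say that the empty list and `(Map.empty, 0)` are `coupled`, and that applying corresponding operations to `coupled` states yields the same result and `coupled` states again. Checking this on sample inputs is no substitute for the proof, but it is a cheap way to catch a wrong relation.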

It turns out that downward simulation, and thus Reynolds’ method, is not *complete*, in that one can construct a pair of data types where one refines the other, but a refinement relation cannot be constructed between them. *Upward simulation*, the mirror image of downward simulation, relies instead on an *abstraction relation* and performs induction from the back of the sequence rather than the front. The combination of both upward and downward simulation *is* complete. The proof of this is presented in de Roever and Engelhardt’s book.

Many programming languages provide features that are commonly called *abstraction*. The most common is the *module*, consisting of one or more *types* (usually left *abstract* in the sense that their implementations are hidden) coupled with *operations* on those types. We can consider a module’s *signature* or interface to be an abstract data type in the semantic sense, where any type-correct implementation can be considered a refinement. In this sense, module systems in programming languages make it substantially easier to do the kind of data abstraction I discuss above, as both abstract and concrete versions are in a machine-readable structure. However, the presence of a module system is neither necessary nor sufficient for data abstraction to be possible.

A perhaps more common use of the word *abstraction* in the Haskell community refers to the *λ*-abstraction^{9}. Seeing as *λ*-calculus-based languages have a very different semantic domain, based on Scott domains, I can’t directly relate the notion of *λ*-abstraction to the kind of semantic abstraction I present here. I’d be very interested to see an explanation of whether there is a solid connection between the very *syntactic* notion of abstraction we see in functional languages, where “abstraction” essentially refers to *parameterisation*, and the kinds of semantic abstractions we see elsewhere.

If you enjoyed this article and you’re a UNSW student, this article is a whirlwind tour of the second-year COMP2111 course, taught by Kai Engelhardt along with yours truly. The course goes into substantially more detail on the *specification* and *derivation* components, including a detailed study of Hoare Logic and Carroll Morgan’s refinement calculus. Feel free to enrol if you’re interested^{10}.

In my undergraduate years, I remember thinking that data abstraction had something to do with header files or object-oriented programming.↩

Further internationalisation is left as an exercise.↩

This is why deterministically simulating a non-deterministic program has exponential complexity in the worst case.↩

Here is just sugar for the program that does not change the state and always executes successfully, equivalent to the trivially true guard, or an assignment .↩

Because all of the semantic relational operators (, etc.) are -monotone, this relation enjoys all the usual congruence properties. You can refine a small part of a program, and the resultant program will be a refinement of the original whole program.↩

The notation is a substitution, substituting the expression instead of the variable .↩

The view of types as abstract interpretations is expounded in great detail in Cousot’s paper.↩

Doing the refinement to a dynamically-expanding array is too much pain for this article, but feel free to do it as an exercise.↩

A lot of Haskell programmers don’t seem to value semantic abstraction anyway. Perhaps this is a case of anti-modular language features such as type classes making real abstraction fall out of favour. Or perhaps Haskell is already so abstract there’s not much point in further abstraction.↩

Assuming UNSW hasn’t gone to hell, the course isn’t cancelled, and the teaching staff aren’t driven out due to poor management — a big assumption.↩

I have had considerable success at ICFP 2016 and co-located events. I will be presenting two papers: one at ICFP, on the Cogent project; and one at TyDe, about my work on proof-automation combinators in Agda. I also co-authored a two-page paper that was accepted to the ML workshop, also co-located with ICFP, but I will not be presenting the paper. Details of all three papers are presented below.

Liam O’Connor, Zilin Chen, Christine Rizkallah, Sidney Amani, Japheth Lim, Toby Murray, Yutaka Nagashima, Thomas Sewell, Gerwin Klein

**Refinement through Restraint: Bringing Down the Cost of Verification**

*International Conference on Functional Programming (ICFP)*, Nara, Japan, September 2016.

Available from CSIRO

We present a framework aimed at significantly reducing the cost of verifying certain classes of systems software, such as file systems. Our framework allows for equational reasoning about systems code written in our new language, Cogent. Cogent is a restricted, polymorphic, higher-order, and purely functional language with linear types and without the need for a trusted runtime or garbage collector. Linear types allow us to assign two semantics to the language: one imperative, suitable for efficient C code generation; and one functional, suitable for equational reasoning and verification. As Cogent is a restricted language, it is designed to easily interoperate with existing C functions and to connect to existing C verification frameworks. Our framework is based on certifying compilation: For a well-typed Cogent program, our compiler produces C code, a high-level shallow embedding of its semantics in Isabelle/HOL, and a proof that the C code correctly refines this embedding. Thus one can reason about the full semantics of real-world systems code productively and equationally, while retaining the interoperability and leanness of C. The compiler certificate is a series of language-level proofs and per-program translation validation phases, combined into one coherent top-level theorem in Isabelle/HOL.

Liam O’Connor

**Applications of Applicative Proof Search**

*Workshop on Type-Driven Development*, Nara, Japan, September, 2016.

In this paper, we develop a library of typed proof search procedures, and demonstrate their remarkable utility as a mechanism for proof-search and automation. We describe a framework for describing proof-search procedures in Agda, with a library of tactical combinators based on applicative functors. This framework is very general, so we demonstrate the approach with two common applications from the field of software verification: a library for property-based testing in the style of SmallCheck, and the embedding of a basic model checker inside our framework, which we use to verify the correctness of common concurrency algorithms.

Yutaka Nagashima and Liam O’Connor

**Close Encounters of the Higher Kind: Emulating Constructor Classes in Standard ML**

*Workshop on ML 2016*, Nara, Japan, September, 2016.

Available from CSIRO

We implement a library for encoding constructor classes in Standard ML, including elaboration from minimal definitions, and automatic instantiation of superclasses.

The language I have been working on for the last couple of years with NICTA/Data61, called Cogent, was recently accepted to ICFP 2016, topping off a string of publications at ITP 2016 and ASPLOS 2016.

The ICFP paper, entitled “Refinement Through Restraint: Bringing Down the Cost of Verification”, outlines our approach to the language design, the implementation of its certifying compiler, and the various refinement proofs that it generates.

The paper will be available from NICTA after publication.

We present a framework aimed at significantly reducing the cost of verifying certain classes of systems software, such as file systems. Our framework allows for equational reasoning about systems code written in our new language, Cogent. Cogent is a restricted, polymorphic, higher-order, and purely functional language with linear types and without the need for a trusted runtime or garbage collector. Linear types allow us to assign two semantics to the language: one imperative, suitable for efficient C code generation; and one functional, suitable for equational reasoning and verification. As Cogent is a restricted language, it is designed to easily interoperate with existing C functions and to connect to existing C verification frameworks. Our framework is based on certifying compilation: For a well-typed Cogent program, our compiler produces C code, a high-level shallow embedding of its semantics in Isabelle/HOL, and a proof that the C code correctly refines this embedding. Thus one can reason about the full semantics of real-world systems code productively and equationally, while retaining the interoperability and leanness of C. The compiler certificate is a series of language-level proofs and per-program translation validation phases, combined into one coherent top-level theorem in Isabelle/HOL.

We also were successful at ITP 2016 with our paper “A Framework for the Automatic Formal Verification of Refinement from Cogent to C”, where we present the details of our low-level translation validation refinement framework. This paper is also available from NICTA.

Our language Cogent simplifies verification of systems software using a certifying compiler, which produces a proof that the generated C code is a refinement of the original Cogent program. Despite the fact that Cogent itself contains a number of refinement layers, the semantic gap between even the lowest level of Cogent semantics and the generated C code remains large. In this paper we close this gap with an automated refinement framework which validates the compiler’s code generation phase. This framework makes use of existing C verification tools and introduces a new technique to relate the type systems of Cogent and C.

Lastly, Gernot Heiser recently presented (on behalf of Sidney Amani) our systems paper at ASPLOS 2016, where we outline the implementation, performance, and verification of two case study file systems using Cogent. This paper is available from NICTA here.

We present an approach to writing and formally verifying high-assurance file-system code in a restricted language called Cogent, supported by a certifying compiler that produces C code, high-level specification of Cogent, and translation correctness proofs. The language is strongly typed and guarantees absence of a number of common file system implementation errors. We show how verification effort is drastically reduced for proving higher-level properties of the file system implementation by reasoning about the generated formal specification rather than its low-level C code. We use the framework to write two Linux file systems, and compare their performance with their native C implementations.

This concludes a string of publications for our project in the three main fields it intersects: programming languages (ICFP), systems (ASPLOS) and formal methods (ITP). Congratulations are due to everyone who contributed.

Work hard, boy, and you’ll find,

One day you’ll have

A blog like mine, blog like mine,

A blog like mine.

— Cat Stevens [Paraphrased]

Over the years many people have asked me to share with them how I developed this website, seeing as it has a variety of very nice features including:

- Embedded LaTeX math actually rendered with LaTeX and embedded in the document. This means I can use LaTeX packages like `tikz` and `xypic` for diagrams, as well as well-typeset formulas inline with other text, with baseline correction.
- Literate Agda support, such as in this article, with semantic highlighting and hyperlinked jump-to-definition, even for the standard library imports.
- Citeproc-based bibliography support, which reads BibTeX databases. Most of my technical articles make use of this, for example this article, and the article linked above.
- Atom-based feed support (see the little button at the bottom of this page).
- Syntax highlighting for most other languages, such as Haskell, visible in the same article linked above.

Up until recently, I was very reluctant to share the code for this website, because, among other things:

- LaTeX rendering was accomplished through the use of an ancient, barely working piece of C code I cribbed from the GladTeX installation. It was very system dependent: it required `latex`, `dvips` and `gs` to be in exactly the right places to work correctly. The formulae were then embedded in the HTML document by postprocessing the output of Pandoc with `tagsoup`, and detecting any equations in a very ad-hoc way. They were cached using a liberal mix of `unsafePerformIO` and hacks, which barely worked and typically caused problems if the LaTeX failed to compile.
- The bibliography support relied on a hacky workaround I had written to provide `Binary` instances for an unexposed part of `pandoc-citeproc`.
- The site would mysteriously corrupt its own Hakyll cache for no reason.
- The Literate Agda compiler was hardcoded against an old version of Agda and was pretty unmaintainable.

However, I am pleased to announce that I have now cleaned up all of these problems. The LaTeX rendering has been completely rewritten in Haskell, and is more configurable and useful. It’s provided in my `latex-formulae` suite of packages, available on GitHub and Hackage under the following three packages:

- `latex-formulae-image` - Basic support for rendering LaTeX formulae to images using actual system LaTeX.
- `latex-formulae-pandoc` - Support for rendering LaTeX equations inside pandoc documents, and a standalone executable that can be used as a Pandoc filter.
- `latex-formulae-hakyll` - A Hakyll compiler that uses `latex-formulae-pandoc`, and adds some useful caching for Hakyll `watch` servers.

The Agda code has been cleaned up and packaged into two useful Hackage packages, also available on GitHub:

- `agda-snippets` - Support for preprocessing *any text* document and replacing Agda code blocks with coloured, hyperlinked source in HTML. It leaves the rest of the text untouched, so it’s possible to then read the output with Pandoc.
- `agda-snippets-hakyll` - Adds some basic mechanisms to integrate `agda-snippets` with Hakyll and Pandoc.

The bibliography workaround is no longer necessary, due to fixes in upstream libraries.

The other features (feeds, highlighting) are all just provided by Hakyll built-in. The source of this website itself is also available on GitHub, in particular the Hakyll code for it may be of interest.

`Vector` of arbitrary elements.
I approached the development of this library from a formal perspective, devising laws for all the operations and rigorously checking them with QuickCheck. I defined a patch as a series of *edits* to a document, where an *edit* is simply an insertion, deletion, or replacement of a particular vector element.

```
newtype Patch a = Patch [Edit a] deriving (Eq)
data Edit a = Insert Int a
            | Delete Int a
            | Replace Int a a
            deriving (Show, Read, Eq)
```

We have a function, `apply`, that takes a patch and applies it to a document:
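As a rough idea of what applying a patch involves, here is a minimal sketch over plain lists rather than a `Vector`. It is not the library's implementation: the names `applyEdit` and `applyList` are hypothetical, and it assumes each edit's index refers to the document as it stands when that edit runs, which may differ from the library's indexing convention.

```haskell
-- Same shapes as the types in the text.
newtype Patch a = Patch [Edit a] deriving (Eq)
data Edit a = Insert Int a
            | Delete Int a
            | Replace Int a a
            deriving (Show, Read, Eq)

-- Apply one edit to a list-based document.
-- Assumption: the index refers to the document as it is when this edit runs.
-- The old element stored in Delete and Replace is not checked here.
applyEdit :: Edit a -> [a] -> [a]
applyEdit (Insert i x)    doc = let (pre, post) = splitAt i doc in pre ++ x : post
applyEdit (Delete i _)    doc = let (pre, post) = splitAt i doc in pre ++ drop 1 post
applyEdit (Replace i _ x) doc = let (pre, post) = splitAt i doc in pre ++ x : drop 1 post

-- Apply the edits of a patch one after another, left to right.
applyList :: Patch a -> [a] -> [a]
applyList (Patch es) doc = foldl (flip applyEdit) doc es
```

Under these assumptions, deleting the first element and then inserting a new one at index 0 produces the same document as a single replacement at index 0, even though the two patches are different values.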

Patches may be structurally different, but accomplish the same thing. For example, a patch that consists of a `Delete` and an `Insert` may extensionally be equivalent to a patch that does a single `Replace`, but they are structurally different. To simplify the mathematical presentation here, we define an equivalence relation that captures this *extensional* patch-equivalence:

To define further operations, we must first note that patches and documents form a *category*. A *category* is made up of a class of *objects* (in this case documents), and a class of arrows or *morphisms* (in this case patches). For each object $A$, there must be an *identity morphism* $\mathrm{id}_A : A \to A$, and for each pair of morphisms $f : A \to B$ and $g : B \to C$ there must be a composed morphism $g \circ f : A \to C$. They must satisfy the following laws:

Left-identity: for any morphism $f : A \to B$, $\mathrm{id}_B \circ f = f$

Right-identity: for any morphism $f : A \to B$, $f \circ \mathrm{id}_A = f$

Associativity: for any three morphisms $f : A \to B$, $g : B \to C$, $h : C \to D$, $h \circ (g \circ f) = (h \circ g) \circ f$

The category laws comprise the first part of our specification. Translating them into Haskell, I made `Patch a` an instance of `Monoid`, just for convenience, even though the composition operator is not defined for arbitrary pairs of patches, and therefore patches are technically not a monoid in the algebraic sense.

Then, the above laws become the following QuickCheck properties:

```
forAll (patchesFrom d) $ \a -> a <> mempty == a
forAll (patchesFrom d) $ \a -> mempty <> a == a
forAll (historyFrom d 3) $ \[a, b, c] ->
  apply (a <> (b <> c)) d == apply ((a <> b) <> c) d
```

Here, `patchesFrom d` is a generator of patches with domain document `d`, and `historyFrom d 3` produces a sequence of patches, one after the other, starting at `d`.

In the case of patches and documents, they form what’s called an *indiscrete category* or a *chaotic category*, as there exists a single, unique^{1} patch between any two documents. A function to *find* that patch is simply `diff`, which takes two documents and computes the patch between them, using the Wagner-Fischer algorithm [Wagner and Fischer 1974].
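For reference, the dynamic programme at the heart of Wagner-Fischer can be sketched in a few lines. This sketch only computes the edit *distance* (the minimum number of insertions, deletions and replacements), not the patch itself, and the name `editDistance` is made up for this example rather than taken from the library.

```haskell
-- Row-by-row Wagner-Fischer: each row holds the distances between a prefix
-- of xs and every prefix of ys. Only the previous row is kept in memory.
editDistance :: Eq a => [a] -> [a] -> Int
editDistance xs ys = last (foldl nextRow firstRow xs)
  where
    -- Distance from the empty prefix of xs to each prefix of ys: all insertions.
    firstRow = [0 .. length ys]

    nextRow [] _ = []  -- unreachable: rows are never empty
    nextRow prev@(p : rest) x = scanl step (p + 1) (zip3 ys prev rest)
      where
        -- left: cell to the left (insertion), up: cell above (deletion),
        -- diag: cell above-left (replacement, free if the elements match).
        step left (y, diag, up) =
          minimum [left + 1, up + 1, diag + (if x == y then 0 else 1)]
```

Recovering an actual patch, as `diff` does, additionally requires remembering which of the three cases was chosen at each cell and tracing back from the final cell.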

It’s easy to come up with correctness properties for such a function, just by examining its interaction with the identity patch, the composition operator, and `apply`:

As there exists a patch between any two documents, it follows that for every patch $p : A \to B$ there exists an *inverse patch* $p^{-1} : B \to A$ such that $p^{-1} \circ p = \mathrm{id}_A$ and $p \circ p^{-1} = \mathrm{id}_B$. We define a function, `inverse`, in Haskell:

And we can check all the usual properties of inverses:

```
forAll (patchesFrom d) $ \p -> p <> inverse p == mempty
forAll (patchesFrom d) $ \p -> inverse p <> p == mempty
forAll (patchesFrom d) $ \p -> inverse (inverse p) == p
forAll (patchesFrom d) $ \p -> inverse mempty == mempty
forAll (historyFrom d 2) $ \[p, q] ->
  inverse (p <> q) == inverse q <> inverse p
```

We can also verify that the inverse patch is the same as the one we could have found with `diff`:

A category that contains inverses for all morphisms is called a *groupoid*. All indiscrete categories (such as patches) are groupoids, as the inverse morphism is guaranteed to exist. Groupoids are very common, and can also be thought of as a group^{2} with a partial composition relation, but I find the categorical presentation cleaner.
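To make the "indiscrete groupoid" structure concrete, here is a toy model in which an arrow is nothing more than its pair of endpoints. This deliberately forgets the edits themselves and is not how `patches-vector` represents patches; the names `Arrow`, `identityAt`, `andThen`, `diffToy` and `inverseToy` are hypothetical and exist only for this sketch.

```haskell
-- In an indiscrete category there is exactly one arrow between any two
-- objects, so an arrow is fully determined by its source and target.
data Arrow d = Arrow { source :: d, target :: d }
  deriving (Show, Eq)

-- The identity arrow on a document.
identityAt :: d -> Arrow d
identityAt d = Arrow d d

-- Composition ("first this, then that"); it is partial, defined only
-- when the target of the first arrow is the source of the second.
andThen :: Eq d => Arrow d -> Arrow d -> Maybe (Arrow d)
andThen (Arrow a b) (Arrow b' c)
  | b == b'   = Just (Arrow a c)
  | otherwise = Nothing

-- The unique arrow between two documents (the role played by diff).
diffToy :: d -> d -> Arrow d
diffToy = Arrow

-- Every arrow has an inverse: swap the endpoints.
inverseToy :: Arrow d -> Arrow d
inverseToy (Arrow a b) = Arrow b a
```

In this model the groupoid laws are immediate: composing `diffToy a b` with its inverse gives `Just (identityAt a)`, and the partiality of `andThen` is exactly why a `Monoid` instance is only a convenience.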

So, we have now specified how to compute the unique patch between any two documents (`diff`), how to squash patches together into a single patch (composition), how to apply patches to a document (`apply`), and how to compute the inverse of a given patch (`inverse`). The only thing we’re missing is the crown jewel of patch theory, how to *merge* patches when they diverge.

I came to patch theory from concurrency control research, and not via the patch theory of Darcs [Jacobson 2009], so there are some differences in how I approached this problem compared to how Darcs does.

In their seminal paper [Ellis and Gibbs 1989] on the topic, Ellis and Gibbs define a function that, given a diverging pair of patches $p$ and $q$, will produce new patches $p'$ and $q'$, such that the result of applying $p$ followed by $q'$ and the result of applying $q$ followed by $p'$ is the same.

They called this approach *operational transformation*, but category theory has a shorter name for it: a *pushout*. A *pushout* of two morphisms $p : A \to B$ and $q : A \to C$ consists of an object $D$ and two morphisms $q' : B \to D$ and $p' : C \to D$ such that $q' \circ p = p' \circ q$. The pushout must also be *universal*, but as our category is indiscrete we know that this is the case without having to do anything.

We can use this pushout, which we call `transform`, as a way to implement merges. Assuming a document history $d_0 \xrightarrow{p_1} d_1 \xrightarrow{p_2} \cdots \xrightarrow{p_n} d_n$ and an incoming patch $q$ from version $d_v$, where $v \le n$, we can simply `transform` the input patch $q$ against the composition of all the patches $p_{v+1}, \dots, p_n$, resulting in a new patch $q'$ that can be applied to the latest document $d_n$.

Note that just specifying the `transform` function to be a pushout isn’t quite sufficient: it would be perfectly possible to resolve two diverging patches $p$ and $q$ by using patches $p^{-1}$ for $q'$ and $q^{-1}$ for $p'$, and they would resolve to the same document, but probably wouldn’t be what the user intended.

Instead, our `transform` function will attempt to incorporate the changes of $p$ into $p'$ and the changes of $q$ into $q'$, up to merge conflicts, which can be handled by a function passed in as a parameter to `transform`:

Then we can add the pushout property as part of our QuickCheck specification:

```
forAll (divergingPatchesFrom d) $ \(p,q) ->
  let (p', q') = transformWith const p q
   in apply (p <> q') d == apply (q <> p') d
```

If the merge handler is commutative, then so is `transformWith`:

```
forAll (divergingPatchesFrom d) $ \(p,q) ->
  let (p' , q' ) = transformWith (*) p q
      (q'', p'') = transformWith (*) q p
   in p' == p''
   && q' == q''
```

We can also ensure that `transformWith` keeps the intention of the input patches by using the identity patch `mempty` as one of the diverging patches:

```
forAll (patchesFrom d) $ \ p ->
  transformWith (*) mempty p == (mempty, p)
forAll (patchesFrom d) $ \ p ->
  transformWith (*) p mempty == (p, mempty)
```
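To see what the merge handler is for, and why the identity properties pin down the intended behaviour, consider a toy model in which a document is a single value and a patch is just its pair of endpoints. Everything here (`Arrow`, `transformToy`, and the three-way merge rule) is an assumption made for illustration, not the library's algorithm.

```haskell
-- A patch in the toy model: fully determined by where it starts and ends.
data Arrow d = Arrow { source :: d, target :: d }
  deriving (Show, Eq)

-- Given diverging patches p : a -> b and q : a -> c, produce (p', q') with
-- p' : c -> d and q' : b -> d, so that both paths end at the same document d.
-- The conflict handler is consulted only when both sides changed the document
-- in different ways.
transformToy :: Eq d => (d -> d -> d) -> Arrow d -> Arrow d -> (Arrow d, Arrow d)
transformToy conflict (Arrow a b) (Arrow _ c) = (Arrow c d, Arrow b d)
  where
    d | b == a    = c             -- only q changed anything: keep q's result
      | c == a    = b             -- only p changed anything: keep p's result
      | b == c    = b             -- both made the same change
      | otherwise = conflict b c  -- a genuine conflict: ask the handler
```

With this rule, transforming against an identity patch returns the other patch unchanged, mirroring the two properties above, and a commutative handler such as `max` makes the merged document independent of the argument order.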

And with that, we’ve specified `patches-vector`. A patch theory is “just” a small, indiscrete groupoid with pushouts^{3}. We can theoretically account for all the usual patch operations: inversion, composition, merging, `diff`, and `apply`, and this gives rise to a spec that is rock solid and machine-checked by QuickCheck.

The full code is available on GitHub and Hackage. Please do try it out!

I also wrote a library, `composition-tree` (also on Hackage and GitHub), which is similarly thoroughly specified, and is a convenient way to store a series of patches in a sequence, with good asymptotics for things like taking the `mconcat` of a sublist. I use these two libraries together with `pandoc`, `acid-state` and `servant` to make a basic wiki system with excellent support for concurrent edits, and edits to arbitrary versions. The wiki system is called `dixi` (also on GitHub and Hackage).

I independently invented this particular flavour of patch theory, but it’s extremely similar to, for example, the patch theory underlying the pijul version control system [see Mimram and Di Giusto 2013], which also uses pushouts to model merges.

Another paper that is of interest is the recent work encoding patch theory inside Homotopy Type Theory using Higher Inductive Types [Angiuli et al. 2014]. HoTT is typically given semantics by ∞-groupoids, so it makes sense that patches would have a natural encoding, but I haven’t read that paper yet.

Also, another paper [Swierstra and Löh 2014] uses separation logic to describe the semantics of version control, which is another interesting take on patch theoretic concepts.

References

Angiuli, C., Morehouse, E., Licata, D.R. and Harper, R., 2014. Homotopical patch theory. In *Proceedings of the 19th ACM SIGPLAN International Conference on Functional Programming*. ICFP ’14. New York, NY, USA: ACM, pp. 243–256.

Ellis, C.A. and Gibbs, S.J., 1989. Concurrency control in groupware systems. In *Proceedings of the 1989 ACM SIGMOD International Conference on Management of Data*. SIGMOD ’89. New York, NY, USA: ACM, pp. 399–407.

Jacobson, J., 2009. *A formalization of darcs patch theory using inverse semigroups*, UCLA Computational and Applied Mathematics.

Mimram, S. and Di Giusto, C., 2013. A categorical theory of patches. *Electronic Notes in Theoretical Computer Science*, 298, pp.283–307.

Swierstra, W. and Löh, A., 2014. The semantics of version control. In *Proceedings of the 2014 ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming & Software*. Onward! 2014. New York, NY, USA: ACM, pp. 43–54.

Wagner, R.A. and Fischer, M.J., 1974. The string-to-string correction problem. *J. ACM*, 21(1), pp.168–173.

Chiefly, they allow us to reason about our program as if all data structures are immutable, with all of the benefits that implies, while the actual implementation performs efficient destructive updates to mutable data structures. This is achieved simply by statically ruling out every program where the difference between the immutable and the mutable interpretations can be observed, by requiring that every variable that refers, directly or indirectly, to a heap data structure, must be used exactly once. As variables cannot be used multiple times, this implies that for any allocated heap object, there is exactly one live, usable reference that exists at any one time. This is called the *uniqueness* property of linear types.

This is a very simple restriction, but it proves a considerable burden when trying to actually write programs. For example, a naïve definition of a linear array would become unusable after just one element was accessed! Other data structures, with complex reference layouts that involve multiple aliasing references and sharing, simply cannot be expressed.

For this reason, when designing the linear systems language Cogent for my research, I allowed parts of the program to be written in unsafe, imperative C, and those C snippets are able to manipulate opaque types that are *abstract* in the purely functional portion. The author of the code would then have to prove that the C code doesn’t do anything too unsafe, that would violate the invariants of the linear type system.

Specifically, Cogent extends the (dynamic) typing relation for values to include *sets of locations* which can be accessed from a value^{1}. For example, the typing rule for tuple values is:

Observe how we have used these pointer sets to enforce that there is no internal aliasing in the structure. It also gives us the information necessary to precisely state the conditions under which a C program is safe to execute. We define *stores* as partial mappings from locations (pointers) to values.

Assuming a C-implemented function is evaluated with an input value and an input store, the return value and output store must satisfy the following three properties for all locations:

- **Leak freedom**: any input reference that wasn’t returned was freed.
- **Fresh allocation**: every new output reference, not in the input, was allocated in previously-free space.
- **Inertia**: every reference not in either the input or the output of the function has not been touched in any way.

Assuming these three things, it’s possible to show that the two semantic interpretations of linear typed programs are equivalent, even if they depend on unsafe, imperative C code. I called these three conditions together the *frame conditions*, named after the *frame problem*, from the field of knowledge representation. The frame problem is a common issue that comes up in many formalisations of stateful processes. Specifically, it refers to the difficulty of *local reasoning* for many of these formalisations. The state or store is typically represented (as in our Cogent formalisation above) as a large, monolithic blob. Therefore, whenever any part of the state is updated, every invariant about the state must be re-established, even if it has nothing to do with the part of the state that was updated. The above conditions allow us to state that the C program does not affect any part of the state except those it is permitted (by virtue of the linear references it received) to modify, thus allowing us to enforce the type system invariants across the whole program.

Presenting such proof obligations in terms of stores and references as described above, however, is extremely tedious and difficult to work with when formally reasoning about imperative programs, particularly if the invariants we are trying to show are initially broken and only later re-established. Typically, imperative programs lend themselves to axiomatic semantics for verification, the most obvious example being Hoare Logic [Hoare 1969], which provides a proof calculus for a judgement written $\{P\}\ C\ \{Q\}$, which states that, assuming the initial state $\sigma$ (which maps variables to values) satisfies an assertion $P$, then the resultant state $\sigma'$ of running $C$ on $\sigma$ satisfies $Q$.

When our assertions involve references and aliasing, however, Hoare Logic doesn’t buy us much over just reasoning about the operational semantics directly. A variety of ad-hoc operators have to be added to the logic, for example to say that references do not alias, references point to free space, or that references point to valid values. To make this cleaner, we turn instead to *Separation Logic* [Reynolds 2002]. Separation logic is a variant of Hoare Logic that is specifically designed to accommodate programming with references and aliasing. It augments the state of Hoare Logic with a mutable store $\mu$, and the following additional assertions:

- A special assertion $\mathbf{emp}$, which states that the store is empty, i.e. $\mu \models \mathbf{emp}$ if and only if $\mu = \emptyset$.
- A binary operator $p \mapsto v$, which states that the store is defined at *exactly one* location, i.e. $\mu \models p \mapsto v$ if and only if $\mu = \{p \mapsto v\}$.
- A *separating conjunction* connective $P * Q$, which says that the store $\mu$ can be split into two disjoint parts $\mu_1$ and $\mu_2$ where $\mu_1 \models P$ and $\mu_2 \models Q$.
- A *separating implication* connective $P \mathrel{-\!\!*} Q$, which says that extending the store with a disjoint part that satisfies $P$ results in a store that satisfies $Q$.

Crucially, Separation Logic includes the *frame rule*, its own solution to the frame problem, where an unrelated assertion can be added to both the pre- and the post-condition of a given program in a separating conjunction:
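In its usual textbook form, the frame rule can be written as follows. The metavariable names ($P$, $Q$ and $R$ for assertions, $C$ for the program) are an assumption of this sketch and may differ from the notation originally used here.

```latex
\frac{\{P\}\; C\; \{Q\}}
     {\{P * R\}\; C\; \{Q * R\}}
\qquad \text{provided } C \text{ does not modify any variable that occurs free in } R
```

Read bottom-up: to verify $C$ against a large store, it suffices to verify it against the part of the store it actually touches, with the frame $R$ carried along unchanged.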

This allows much the same local reasoning that we desired before: The program $C$ can be verified to work for a store that satisfies $P$, but otherwise contains *no other values*. Then that program may be freely used with a *larger* state and we automatically learn, from the frame rule, that any unrelated bit of state $R$ cannot affect, and is not affected by, the program $C$.

Separation logic makes expressing these obligations substantially simpler. For example, given a program with input pointers and output pointers, we can express all three frame conditions as a single triple:

Here is a sketch of a proof that this implies the frame conditions listed above. Assume an input store . Split into disjoint stores and such that . Let the output store of running with be . Note that by the triple above, we have that .

We have by the frame rule that the output of running with the full store is where .

- **Leak freedom**: For any arbitrary location , if but then we must show that . As , we know from that and, as they are disjoint, . Therefore, the only way for to be true is if , but as from , we can conclude that .
- **Fresh allocation**: If but then we must show that . We have from that , and hence . As they are disjoint, so the only way for to be true is if . But, as we know that from and , we can conclude that .
- **Inertia**: If and , then we can conclude from that and from that . If , then , thanks to the frame rule as shown above. If , then and and therefore we can say that as they’re both undefined.

I think this is a much cleaner and easier way to state the frame conditions.

My next item to investigate is how I might integrate this into a seamless language and verification framework. My current thinking is to take a lambda calculus with linear types and refinement types, and augment it with an imperative embedded language, which allows several of the guarantees of the linear type system to be suspended. The imperative embedded language might resemble the Hoare-state monad [Swierstra 2009], only using Separation Logic rather than Hoare Logic, but I am still figuring out all the details.

References

Hoare, C.A.R., 1969. An axiomatic basis for computer programming. *Communications of the ACM*, 12(10), pp.576–580.

Reynolds, J.C., 2002. Separation logic: A logic for shared mutable data structures. In *Proceedings of the 17th Annual IEEE Symposium on Logic in Computer Science*. LICS ’02. Washington, DC, USA: IEEE Computer Society, pp. 55–74.

Swierstra, W., 2009. A Hoare logic for the state monad. In *Theorem Proving in Higher Order Logics*. Lecture Notes in Computer Science. Springer Berlin Heidelberg, pp. 440–451.

1. The real formalisation is a bit more complicated, allowing nonlinear *read-only* pointers as well as linear, writable ones.

```
{-# OPTIONS --type-in-type #-}
module 2015-09-10-girards-paradox where
open import Data.Empty
open import Data.Unit
open import Data.Bool
open import Relation.Binary.PropositionalEquality
open import Data.Product
```

Axiomatic set theories such as that of Zermelo and Fraenkel, in their attempt to provide a comprehensive foundation for mathematics, involve several intricate tricks to avoid becoming inconsistent. A suitably naïve set theory is already inconsistent due to the infamous paradoxical set of Russell [1938].
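Written in set-builder notation, the paradoxical set is the following. The name $R$ is an assumption of this sketch, chosen for convenience.

```latex
R = \{\, x \mid x \notin x \,\}
\qquad\text{so that}\qquad
R \in R \iff R \notin R
```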

Here we have used *set comprehension* to define $R$ as the set of all sets that do not contain themselves. This leads to the question, *does $R$ contain $R$*? If $R$ is an element of $R$, then it is not, as $R$ only contains sets that do not contain themselves. If $R$ is not an element of $R$, then it is, as $R$ does not contain itself. We have a paradox!

To address this, different foundations take different approaches. Most axiomatic set theories eliminate or restrict the *rule of comprehension*, that is, they don’t allow sets to be constructed from arbitrary predicates. Instead, set comprehension can only be used to describe subsets of already constructed sets. This prevents comprehension from being used above, but it also prevents a lot of other useful constructions, like products or unions! Thus a handful of other axioms to construct sets are added, such as pairing, union, powerset and so on (all nicely explained in Halmos [1960]).

Another axiom, that of *regularity*, says^{1} that there is no infinite sequence of sets $a_0, a_1, a_2, \dots$ such that, for any $i$, $a_{i+1} \in a_i$. This implies that no set can contain itself, and allows us to build the universe of set theory by *stages*, called *ranks*. At rank zero, no sets exist; at rank one, there is just the empty set; at rank two, there is also the set containing the empty set; and at each following rank, the added sets all contain the sets that are defined at earlier ranks, as shown in the following figure:

The entire universe of set theory can be thought of as the union of the universe at each rank, $V = \bigcup_{\alpha} V_\alpha$, a presentation originally due to Zermelo [1930], but commonly attributed to John von Neumann.
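Spelled out in the standard notation for the cumulative hierarchy (the symbols $V_n$ and $\mathcal{P}$ are the usual textbook ones and are an assumption about the notation intended here), the first few ranks described above are:

```latex
V_0 = \emptyset, \qquad
V_1 = \mathcal{P}(V_0) = \{\emptyset\}, \qquad
V_2 = \mathcal{P}(V_1) = \{\emptyset, \{\emptyset\}\}, \qquad
V_{n+1} = \mathcal{P}(V_n), \qquad
V_\lambda = \bigcup_{\beta < \lambda} V_\beta \ \text{ for limit ordinals } \lambda
```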

This stratification bears remarkable similarity to Russell’s theory of *types* (see Russell [1938]), his own solution to the paradoxical set, and the distant ancestor of modern type theory.

Indeed, in the intuitionistic type theory of Martin-Löf [1984], the approximate foundation of the Agda proof assistant, we have a hierarchy of types that very much resembles that of von Neumann or Zermelo^{2}:

$$\mathsf{Set}_0 : \mathsf{Set}_1 : \mathsf{Set}_2 : \cdots$$

The rule of *cumulativity*, which is not present in Agda^{3}, but exists in some type theories and languages such as Idris, makes this resemblance even stronger:
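In the usual inference-rule notation, cumulativity can be stated as below. The context $\Gamma$ and the level variable $n$ are naming assumptions of this sketch.

```latex
\frac{\Gamma \vdash A : \mathsf{Set}_n}
     {\Gamma \vdash A : \mathsf{Set}_{n+1}}
```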

This rule implies that, like the von Neumann rank $V_n$, a type $\mathsf{Set}_n$ is inhabited by every type $\mathsf{Set}_m$ where $m < n$.

The differences between the two theories start to emerge when one examines *why* this stratification exists in type theory. In axiomatic set theory, eliminating the axiom of regularity and thus the stratification it implies makes it rather difficult to do induction, but it does not make the theory inconsistent: there have been several *non-well-founded set theories* proposed, such as the hyperset theory of Aczel [1988], which do exactly this.

Removing unrestricted set comprehension is enough to avoid Russell’s paradox, as it allows us to distinguish between *formulae* (or predicates) and *sets*. Unlike informal set theory, we cannot construct a set for any given formula. For example, $x \notin x$ is a valid formula, but $\{x \mid x \notin x\}$ is *not* a set.

Type theories are not set theories: they do not have a separate logical formula language, like that of Frege, to serve as a basis for the theory. So, one cannot achieve consistency in type theory by restricting how a set may be constructed from a logical formula. Instead, type theory places restrictions on the kinds of formulae that can be expressed. Rather than rule out paradoxical *sets* representing self-referential propositions, type theory rules out *the propositions themselves*. In such a theory, it is not even well-formed to ask if a set contains itself^{4}.

This restriction is a consequence of the hierarchy mentioned earlier: remove this from type theory, by saying instead that $\mathsf{Set} : \mathsf{Set}$, and the result is more or less equivalent to Falso^{5}. We can show that type theory is inconsistent with this change using Girard’s paradox, which is a generalised encoding of Russell’s paradox for pure type systems. The contradiction derived from this paradox is rather involved, so much so that Martin-Löf himself didn’t realise that it applied to the first version of his type theory. Hurkens [1995] provided a simplification, which is encoded in Agda here.

With inductive types, however, we can use Russell’s paradox directly, by formalising a naïve notion of sets as comprehensions, and using this to derive a contradiction.

For these (interactive) Agda snippets, I have enabled `--type-in-type`, which removes the predicative hierarchy from the type theory, instead stating that `Set : Set`.

```
data SET : Set where
  set : (X : Set) → (X → SET) → SET
```

This defines a set (written `set X f`) as a comprehension over a *carrier type* `X` and a function `f`, where the element for a given index value `x : X` is given by `f x`. This definition is already using the fact that `Set : Set`: normally, a type `X` (of type `Set₀`) would not be permitted as a parameter to `set`, which constructs a type of the same size `Set₀`.

The empty set, having no elements, uses the empty type `⊥` as its carrier:

```
∅ : SET
∅ = set ⊥ ⊥-elim
```

The set containing the empty set, having one element, uses the unit type as its carrier:

```
⟨∅⟩ : SET
⟨∅⟩ = set ⊤ (λ _ → ∅)
```

The next rank has two elements, and thus can use `Bool` as its carrier:

```
⟨∅,⟨∅⟩⟩ : SET
⟨∅,⟨∅⟩⟩ = set Bool (λ x → if x then ∅ else ⟨∅⟩)
```

More sets could be defined using similar techniques, so I will forgo any further definitions.

We can also define the membership operators for our `SET` type:

```
_∈_ : SET → SET → Set
a ∈ set X f = Σ X (λ x → a ≡ f x)
_∉_ : SET → SET → Set
a ∉ b = (a ∈ b) → ⊥
```

A value of type `a ∈ set X f` can be thought of as a proof that there exists a value `x : X` for which the element function `f` gives `a`.

Using these operators, we can define Russell’s paradoxical set as follows:

```
Δ : SET
Δ = set (Σ SET (λ s → s ∉ s)) proj₁
```

This is a set which, for its carrier type, uses *pairs* containing a set `s` and a proof that `s` does not contain itself. The element function just discards the proof, leaving us with the `SET` of all `SET`s that do not contain themselves.

Indeed, we can prove that every set which is in `Δ` does not contain itself:

```
x∈Δ→x∉x : ∀ {X} → X ∈ Δ → X ∉ X
x∈Δ→x∉x ((Y , Y∉Y) , refl) = Y∉Y
```

A corollary of this is that `Δ` itself does not contain itself:

```
Δ∉Δ : Δ ∉ Δ
Δ∉Δ Δ∈Δ = x∈Δ→x∉x Δ∈Δ Δ∈Δ
```

But we know that every set which does not contain itself is in `Δ`:

```
x∉x→x∈Δ : ∀ {X} → X ∉ X → X ∈ Δ
x∉x→x∈Δ {X} X∉X = (X , X∉X) , refl
```

And from this we can derive a contradiction:

```
falso : ⊥
falso = Δ∉Δ (x∉x→x∈Δ Δ∉Δ)
```

I find it very curious that two very different approaches to formalising mathematics end up with much the same stratified character, and for different reasons. Perhaps this Russell-style hierarchy is, kind of like the Church-Turing thesis, a fundamental characteristic of any sufficiently expressive foundation. Something *discovered* rather than *invented*. In the words of Scott [1974]:

> The truth is that there is only one satisfactory way of avoiding the paradoxes: namely, the use of some form of the *theory of types*. That was at the basis of both Russell’s and Zermelo’s intuitions. Indeed the best way to regard Zermelo’s theory is as a simplification and extension of Russell’s. (We mean Russell’s *simple* theory of types, of course.) The simplification was to make the types *cumulative*. Thus mixing of types is easier and annoying repetitions are avoided. Once the later types are allowed to accumulate the earlier ones, we can then easily imagine *extending* the types into the transfinite – just how far we want to go must necessarily be left open. Now Russell made his types *explicit* in his notation and Zermelo left them *implicit*. [emphasis in original]

The Agda development in this post is taken from one of Thorsten Altenkirch’s lectures, the code of which is available here. The original proof is, as far as I can tell, due to Chad E Brown, who formulated the same thing in Coq.

Aczel, P., 1988. *Non-well-founded sets*, Center for the Study of Language and Information, Stanford University.

Halmos, P., 1960. *Naïve Set Theory*, Van Nostrand.

Hurkens, A.J.C., 1995. A simplification of Girard’s Paradox. In M. Dezani-Ciancaglini & G. Plotkin, eds. *Typed Lambda Calculi and Applications: Proceedings of the 2nd International Conference on Typed Lambda Calculi and Applications (TLCA-95)*. Berlin, Heidelberg: Springer, pp. 266–278.

Martin-Löf, P., 1984. *Intuitionistic Type Theory*, Bibliopolis.

Russell, B., 1938. *Principles of Mathematics* 2nd ed., W.W. Norton.

Scott, D., 1974. Axiomatizing set theory. In *Axiomatic Set Theory (Proceedings of the Symposium on Pure Mathematics, Vol. XIII, Part II, University of California, Los Angeles, California, 1967)*. American Mathematical Society, Providence, R.I., pp. 207–214.

Zermelo, E., 1930. Über Grenzzahlen und Mengenbereiche: Neue Untersuchungen über die Grundlagen der Mengenlehre. *Fundamenta Mathematicae*, 16, pp.29–47.

1. This presentation is not the normal one found in textbooks, which is that every non-empty set contains an element that is disjoint from itself, but that presentation is more brain-bending, and is implied by the statement presented here if you include the axiom of dependent choice.
2. Here, $\mathsf{Set}$ is the type given to types, similar to the *kind* `*` in Haskell, and is not a reference to the sets of axiomatic set theory.
3. Agda makes use of explicit *universe polymorphism* instead, and I’m still undecided which version of type theory I like better.
4. In set theory, it’s a valid question to ask, just the answer is always “no”.
5. *Falso* is a registered trademark of Estatis, Inc. All Rights Reserved.

```
module 2015-08-23-verified-compiler where
open import Data.Fin hiding (_+_; _≟_) renaming (#_ to i)
open import Data.Nat hiding (_≟_)
open import Data.Vec hiding (_>>=_; _⊛_)
```

Recently my research has been centered around the development of a self-certifying compiler for a functional language with linear types called Cogent (see O’Connor et al. [2016]). The compiler works by emitting, along with generated low-level code, a proof in Isabelle/HOL (see Nipkow et al. [2002]) that the generated code is a refinement of the original program, expressed via a simple functional semantics in HOL.

As dependent types unify for us the language of code and proof, my current endeavour has been to explore how such a compiler would look if it were implemented and verified in a dependently typed programming language instead. In this post, I implement and verify a toy compiler for a language of arithmetic expressions and variables to an idealised assembly language for a virtual stack machine, and explain some of the useful features that dependent types give us for writing verified compilers.


One of the immediate advantages that dependent types give us is that we can encode the notion of *term wellformedness* in the type given to terms, rather than as a separate proposition that must be assumed by every theorem.

Even in our language of arithmetic expressions and variables, which does not have much of a static semantics, we can still ensure that each variable used in the program is bound somewhere. We will use indices instead of variable names in the style of de Bruijn [1972], and index terms by the *number of available variables*, a trick I first noticed in McBride [2003]. The `Fin` type, used to represent variables, only contains natural numbers up to its index, which makes it impossible to use variables that are not available.

```
data Term (n : ℕ) : Set where
  Lit     : ℕ → Term n
  _⊠_     : Term n → Term n → Term n
  _⊞_     : Term n → Term n → Term n
  Let_In_ : Term n → Term (suc n) → Term n
  Var     : Fin n → Term n
```

This allows us to express in the *type* of our big-step semantics relation that the environment `E` (here we used the length-indexed `Vec` type from the Agda standard library) should have a value for every available variable in the term. In any Isabelle specification of the same, we would have to add such length constraints as explicit assumptions, either in the semantics themselves or in theorems about them. In Agda, the dynamic semantics are extremely clean, unencumbered by irritating details of the encoding:

```
infixl 5 _⊢_⇓_
data _⊢_⇓_ {n : ℕ} (E : Vec ℕ n) : Term n → ℕ → Set where
  lit-e   : ∀{n}
            -------------
          → E ⊢ Lit n ⇓ n

  times-e : ∀{e₁ e₂}{v₁ v₂}
          → E ⊢ e₁ ⇓ v₁
          → E ⊢ e₂ ⇓ v₂
            ---------------------
          → E ⊢ e₁ ⊠ e₂ ⇓ v₁ * v₂

  plus-e  : ∀{e₁ e₂}{v₁ v₂}
          → E ⊢ e₁ ⇓ v₁
          → E ⊢ e₂ ⇓ v₂
            ---------------------
          → E ⊢ e₁ ⊞ e₂ ⇓ v₁ + v₂

  var-e   : ∀{n}{x}
          → E [ x ]= n
            -------------
          → E ⊢ Var x ⇓ n

  let-e   : ∀{e₁}{e₂}{v₁ v₂}
          → E ⊢ e₁ ⇓ v₁
          → (v₁ ∷ E) ⊢ e₂ ⇓ v₂
            ---------------------
          → E ⊢ Let e₁ In e₂ ⇓ v₂
```
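For contrast, here is a sketch of the same semantics as a Haskell evaluator without length-indexed types. The names are assumptions that mirror the Agda definitions (with `Int` standing in for ℕ, and `Times` and `Plus` standing in for `_⊠_` and `_⊞_`). Because nothing ties the length of the environment to the term, an out-of-scope variable is only caught at run time, which is exactly the case that the `Fin`-indexed encoding rules out statically.

```haskell
-- Untyped counterpart of the Term type: variables are plain de Bruijn indices.
data Term
  = Lit Int
  | Times Term Term
  | Plus Term Term
  | Let Term Term   -- binds the value of the first term as variable 0 in the second
  | Var Int
  deriving (Show, Eq)

-- Big-step evaluation. The environment lists the values of the bound
-- variables, most recently bound first. Nothing signals an unbound variable.
eval :: [Int] -> Term -> Maybe Int
eval _   (Lit n)       = Just n
eval env (Times e1 e2) = (*) <$> eval env e1 <*> eval env e2
eval env (Plus e1 e2)  = (+) <$> eval env e1 <*> eval env e2
eval env (Let e1 e2)   = eval env e1 >>= \v1 -> eval (v1 : env) e2
eval env (Var x)       = lookupVar x env

-- Safe positional lookup; the Agda version needs no failure case because
-- Fin n can only name positions that exist in a Vec of length n.
lookupVar :: Int -> [a] -> Maybe a
lookupVar _ [] = Nothing
lookupVar i (v : vs)
  | i < 0     = Nothing
  | i == 0    = Just v
  | otherwise = lookupVar (i - 1) vs
```

Every theorem about `eval` would have to carry a side condition that the term is well-scoped with respect to the environment, which is the kind of assumption the indexed Agda definitions make unnecessary.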

By using appropriate type indices, it is possible to extend this technique to work even for languages with elaborate static semantics. For example, linear type systems (see Walker [2005]) can be encoded by indexing terms by type contexts (in a style similar to Oleg). Therefore, the boundary between being *wellformed* and being *well-typed* is entirely arbitrary. It’s possible to use relatively simple terms and encode static semantics as a separate judgement, or to put the entire static semantics inside the term structure, or to use a mixture of both. In this simple example, our static semantics only ensure variables are in scope, so it makes sense to encode the entire static semantics in the terms themselves.

Similar tricks can be employed when encoding our target language, the stack machine `SM`. This machine consists of two stacks of numbers, the *working* stack `W` and the *storage* stack `S`, and a program to evaluate. A program is a list of *instructions*.

There are six instructions in total, each of which manipulates these two stacks in various ways. When encoding these instructions in Agda, we index the `Inst` type by the size of both stacks before and after execution of the instruction:

```
data Inst : ℕ → ℕ → ℕ → ℕ → Set where
  num   : ∀{w s} → ℕ → Inst w s (suc w) s
  plus  : ∀{w s} → Inst (suc (suc w)) s (suc w) s
  times : ∀{w s} → Inst (suc (suc w)) s (suc w) s
  push  : ∀{w s} → Inst (suc w) s w (suc s)
  pick  : ∀{w s} → Fin s → Inst w s (suc w) s
  pop   : ∀{w s} → Inst w (suc s) w s
```

Then, we can define a simple type for programs, essentially a list of instructions where the stack sizes of consecutive instructions must match. This makes it impossible to construct a program with an underflow error:

```
data SM (w s : ℕ) : ℕ → ℕ → Set where
halt : SM w s w s
_∷_ : ∀{w′ s′ w″ s″} → Inst w s w′ s′ → SM w′ s′ w″ s″ → SM w s w″ s″
```
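As a small illustration of how the indices rule out underflow, here is a sketch of a well-typed program next to an ill-typed one. The names `two-plus-three` and `underflow` are hypothetical and not part of the original development; the example assumes only the `Inst` and `SM` definitions above.

```agda
-- Hypothetical example (not in the original development).
-- Push 2 and 3, then add them. Starting from two empty stacks,
-- the program leaves exactly one number on the working stack.
two-plus-three : SM 0 0 1 0
two-plus-three = num 2 ∷ num 3 ∷ plus ∷ halt

-- By contrast, the following is rejected by the type checker:
-- plus has type Inst (suc (suc w)) s (suc w) s, so it needs at
-- least two numbers on the working stack, but we start with none.
--
--   underflow : SM 0 0 1 0
--   underflow = plus ∷ halt
```

The four indices of `SM` read as "working and storage sizes before, then after", so `SM 0 0 1 0` describes a program that can be run on empty stacks and produces a single result.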

We also define a simple sequential composition operator, equivalent to list append (`++`):

```
infixr 5 _⊕_
_⊕_ : ∀{w s w′ s′ w″ s″} → SM w s w′ s′ → SM w′ s′ w″ s″ → SM w s w″ s″
halt ⊕ q = q
(x ∷ p) ⊕ q = x ∷ (p ⊕ q)
```

The semantics of each instruction are given by the following relation, which takes the two stacks and an instruction as input, returning the two updated stacks as output. Note that the size of each stack is prescribed by the type of the instruction, just as the size of the environment was prescribed by the type of the term in the source language, which eliminates the need to add tedious wellformedness assumptions to theorems or rules.

```
infixl 5 _∣_∣_↦_∣_
data _∣_∣_↦_∣_ : ∀{w s w′ s′}
               → Vec ℕ w → Vec ℕ s
               → Inst w s w′ s′
               → Vec ℕ w′ → Vec ℕ s′
               → Set where
```

The semantics of each instruction are as follows:

- `num n` (where `n : ℕ`) pushes `n` to `W`.

```
  num-e : ∀{w s}{n}{W : Vec _ w}{S : Vec _ s}
        -------------------------
        → W ∣ S ∣ num n ↦ n ∷ W ∣ S
```

- `plus` pops two numbers from `W` and pushes their sum back to `W`.

```
  plus-e : ∀{w s}{n m}{W : Vec _ w}{S : Vec _ s}
         ----------------------------------------
         → (n ∷ m ∷ W) ∣ S ∣ plus ↦ (m + n ∷ W) ∣ S
```

- `times` pops two numbers from `W` and pushes their product back to `W`.

```
  times-e : ∀{w s}{n m}{W : Vec _ w}{S : Vec _ s}
          -----------------------------------------
          → (n ∷ m ∷ W) ∣ S ∣ times ↦ (m * n ∷ W) ∣ S
```

- `push` pops a number from `W` and pushes it to `S`.

```
  push-e : ∀{w s}{n}{W : Vec _ w}{S : Vec _ s}
         --------------------------------
         → (n ∷ W) ∣ S ∣ push ↦ W ∣ (n ∷ S)
```

- `pick x` (where `x : Fin s` is a valid index into `S`) pushes the number at position `x` from the top of `S` onto `W`.

```
  pick-e : ∀{w s}{x}{n}{W : Vec _ w}{S : Vec _ s}
         → S [ x ]= n
         ----------------------------
         → W ∣ S ∣ pick x ↦ (n ∷ W) ∣ S
```

- `pop` removes the top number from `S`.

```
  pop-e : ∀{w s}{n}{W : Vec _ w}{S : Vec _ s}
        -------------------------
        → W ∣ (n ∷ S) ∣ pop ↦ W ∣ S
```

As programs are lists of instructions, the evaluation of programs is naturally specified as a list of evaluations of instructions:

```
infixl 5 _∣_∣_⇓_∣_
data _∣_∣_⇓_∣_ {w s : ℕ}(W : Vec ℕ w)(S : Vec ℕ s) : ∀{w′ s′}
               → SM w s w′ s′
               → Vec ℕ w′ → Vec ℕ s′
               → Set where
  halt-e : W ∣ S ∣ halt ⇓ W ∣ S
  _∷_ : ∀{w′ s′ w″ s″}{i}{is}
      → {W′ : Vec ℕ w′}{S′ : Vec ℕ s′}
      → {W″ : Vec ℕ w″}{S″ : Vec ℕ s″}
      → W ∣ S ∣ i ↦ W′ ∣ S′
      → W′ ∣ S′ ∣ is ⇓ W″ ∣ S″
      --------------------------
      → W ∣ S ∣ (i ∷ is) ⇓ W″ ∣ S″
```
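To see these rules in action, here is a small hand-built derivation. It is a sketch rather than part of the original development: the name `eval-example` is made up for illustration, and it assumes the vector constructors `[]` and `_∷_` from `Data.Vec` are in scope, as they are for the rules above. The program pushes 2 and 3 and adds them, and each instruction contributes exactly one rule to the derivation.

```agda
-- Hypothetical example (not in the original development).
-- Starting from empty working and storage stacks, the program
-- num 2 ∷ num 3 ∷ plus ∷ halt leaves 5 on the working stack.
eval-example : [] ∣ [] ∣ (num 2 ∷ num 3 ∷ plus ∷ halt) ⇓ (5 ∷ []) ∣ []
eval-example = num-e ∷ num-e ∷ plus-e ∷ halt-e
```

Note that `_∷_` is used at three different types here: as the constructor of programs, of vectors, and of evaluations. Agda resolves the overloading from the expected type.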

The semantics of sequential composition is predictably given by appending these lists:

```
infixl 4 _⟦⊕⟧_
_⟦⊕⟧_ : ∀{w w′ w″ s s′ s″}{P}{Q}
      → {W : Vec ℕ w}{S : Vec ℕ s}
      → {W′ : Vec ℕ w′}{S′ : Vec ℕ s′}
      → {W″ : Vec ℕ w″}{S″ : Vec ℕ s″}
      → W ∣ S ∣ P ⇓ W′ ∣ S′
      → W′ ∣ S′ ∣ Q ⇓ W″ ∣ S″
      -------------------------
      → W ∣ S ∣ P ⊕ Q ⇓ W″ ∣ S″
halt-e ⟦⊕⟧ ys = ys
x ∷ xs ⟦⊕⟧ ys = x ∷ (xs ⟦⊕⟧ ys)
```

Having formally defined our source and target languages, we can now prove our compiler correct — even though we haven’t written a compiler yet!

One of the other significant advantages dependent types bring to compiler verification is the elimination of repetition. In my larger Isabelle formalisation, the proof of the compiler’s correctness largely duplicates the structure of the compiler itself, and this tight coupling means that proofs must be rewritten along with the program, which is a highly tedious exercise. As dependently typed languages unify the language of code and proof, we can merely provide the correctness proof: in almost all cases, the correctness proof is so specific that the program of which it demonstrates correctness can be *derived automatically*.

```
open import Data.Product
open import Function
open import Data.String
open import Data.String.Unsafe
```

We define a compiler’s correctness to be the commutativity of the following diagram, as per Hutton and Wright [2004].

As we have not proven determinism for our semantics^{1}, such a correctness condition must be shown by the conjunction of a *soundness* and *completeness* condition, similar to Bahr [2015].

**Soundness** is a proof that the compiler output is a *refinement* of the input, that is, every evaluation in the output is matched by the input. The output does not do anything that the input doesn’t do.

```
-- Sound t u means that u is a sound translation of t
Sound : ∀{w s} → Term s → SM w s (suc w) s → Set
Sound {w} t u = ∀{v}{E}{W : Vec ℕ w}
              → W ∣ E ∣ u ⇓ (v ∷ W) ∣ E
              -----------------------
              → E ⊢ t ⇓ v
```

Note that we generalise the evaluation statements used here slightly to use arbitrary environments and stacks. This is to allow our induction to proceed smoothly.

**Completeness** is a proof that the compiler output is an *abstraction* of the input, that is, every evaluation in the input is matched by the output. The output does everything that the input does.

```
Complete : ∀{w s} → Term s → SM w s (suc w) s → Set
Complete {w} t u = ∀{v}{E}{W : Vec ℕ w}
                 → E ⊢ t ⇓ v
                 -----------------------
                 → W ∣ E ∣ u ⇓ (v ∷ W) ∣ E
```

It is this *completeness* condition that will allow us to automatically derive our code generator. Given a term `t`, our generator will return a Σ-type, or *dependent pair*, containing a program called `u` and a proof that `u` is a complete translation of `t`:

```
codegen′ : ∀{w s}
         → (t : Term s)
         → Σ[ u ∈ SM w s (suc w) s ] Complete t u
```

For literals, we simply push the number of the literal onto the working stack:

```
codegen′ (Lit x) = _ , proof
  where
    proof : Complete _ _
    proof lit-e = num-e ∷ halt-e
```

The code above never explicitly states what program to produce! Instead, it merely provides the completeness proof, and the rest can be inferred by unification. Similar elision can be used for variables, which pick the correct index from the storage stack:

```
codegen′ (Var x) = _ , proof
  where
    proof : Complete _ _
    proof (var-e x) = pick-e x ∷ halt-e
```

The two binary operations follow the standard infix-to-postfix tree traversal. Once again, the program is not explicitly emitted; it is inferred from the completeness proof used.

```
codegen′ (t₁ ⊞ t₂) = _ , proof (proj₂ (codegen′ t₁)) (proj₂ (codegen′ t₂))
  where
    proof : ∀ {u₁}{u₂} → Complete t₁ u₁ → Complete t₂ u₂ → Complete _ _
    proof p₁ p₂ (plus-e t₁ t₂) = p₁ t₁ ⟦⊕⟧ p₂ t₂ ⟦⊕⟧ plus-e ∷ halt-e
codegen′ (t₁ ⊠ t₂) = _ , proof (proj₂ (codegen′ t₁)) (proj₂ (codegen′ t₂))
  where
    proof : ∀ {u₁}{u₂} → Complete t₁ u₁ → Complete t₂ u₂ → Complete _ _
    proof p₁ p₂ (times-e t₁ t₂) = p₁ t₁ ⟦⊕⟧ p₂ t₂ ⟦⊕⟧ times-e ∷ halt-e
```

The variable-binding form pushes the variable to the storage stack and cleans up with `pop` after evaluation exits the scope.

```
codegen′ (Let t₁ In t₂)
  = _ , proof (proj₂ (codegen′ t₁)) (proj₂ (codegen′ t₂))
  where
    proof : ∀ {u₁}{u₂} → Complete t₁ u₁ → Complete t₂ u₂ → Complete _ _
    proof p₁ p₂ (let-e t₁ t₂)
      = p₁ t₁ ⟦⊕⟧ push-e ∷ (p₂ t₂ ⟦⊕⟧ pop-e ∷ halt-e)
```

We can extract a more standard-looking code generator function simply by throwing away the proof that our code generator produces.

```
codegen : ∀{w s}
        → Term s
        → SM w s (suc w) s
codegen {w}{s} t = proj₁ (codegen′ {w}{s} t)
```
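As a quick sanity check, the extracted generator can be applied to a small closed term. This is only a sketch: the name `compiled-sum` is hypothetical, and it assumes `Lit` takes a natural number and `_⊞_` is the addition constructor of `Term`, as the `codegen′` clauses above suggest.

```agda
-- Hypothetical example (not in the original development):
-- compile the term 2 + 3 for empty initial stacks. The stack sizes
-- w = 0 and s = 0 are inferred from the type signature.
compiled-sum : SM 0 0 1 0
compiled-sum = codegen (Lit 2 ⊞ Lit 3)

-- Normalising compiled-sum (C-c C-n in the Emacs mode) should give
--   num 2 ∷ num 3 ∷ plus ∷ halt
-- since the ⊞ case composes the code for both operands with plus.
```

The program text never appears in `codegen′`; it falls out of the shape of the completeness proof, which is why normalising is the easiest way to see it.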

We use an alternative presentation of the soundness property that makes explicit several equalities that are implicit in the original formulation of soundness. We prove that our new formulation still implies the original one.

```
open import Relation.Binary.PropositionalEquality

Sound′ : ∀{w s} → Term s → SM w s (suc w) s → Set
Sound′ {w} t u = ∀{E E′}{W : Vec ℕ w}{W′}
               → W ∣ E ∣ u ⇓ W′ ∣ E′
               ------------------------------------------
               → (E ≡ E′) × (tail W′ ≡ W) × E ⊢ t ⇓ head W′

sound′→sound : ∀{w s}{t}{u} → Sound′ {w}{s} t u → Sound t u
sound′→sound p x with p x
... | refl , refl , q = q
```

As our soundness proof requires us to do a lot of rule inversion on the evaluation of programs, we need an eliminator for the introduction rule `_⟦⊕⟧_`, used in the completeness proof, which breaks an evaluation of a sequential composition into evaluations of its component parts:

```
⊕-elim : ∀{w s w′ s′ w″ s″}
          {W : Vec ℕ w}{S : Vec ℕ s}
          {W″ : Vec ℕ w″}{S″ : Vec ℕ s″}
          {a : SM w s w′ s′}{b : SM w′ s′ w″ s″}
       → W ∣ S ∣ a ⊕ b ⇓ W″ ∣ S″
       → ∃[ W′ ] ∃[ S′ ] ((W ∣ S ∣ a ⇓ W′ ∣ S′) × (W′ ∣ S′ ∣ b ⇓ W″ ∣ S″))
⊕-elim {a = halt} p = _ , _ , halt-e , p
⊕-elim {a = a ∷ as} (x ∷ p) with ⊕-elim {a = as} p
... | _ , _ , p₁ , p₂ = _ , _ , x ∷ p₁ , p₂
```

Then the soundness proof is given as a boatload of rule inversion and matching on equalities, to convince Agda that there is no other way to possibly evaluate the compiler output:

```
soundness : ∀{w s}{t : Term s} → Sound′ {w} t (codegen t)
soundness {t = Lit x} (num-e ∷ halt-e) = refl , refl , lit-e
soundness {t = Var x} (pick-e x₁ ∷ halt-e) = refl , refl , var-e x₁
soundness {t = t₁ ⊠ t₂} x
  with ⊕-elim {a = codegen t₁ ⊕ codegen t₂} x
... | _ , _ , p , _
  with ⊕-elim {a = codegen t₁} p
... | _ , _ , p₁ , p₂
  with soundness {t = t₁} p₁ | soundness {t = t₂} p₂
soundness {t = t₁ ⊠ t₂} x
  | _ ∷ _ ∷ _ , ._ , _ , times-e ∷ halt-e
  | ._ ∷ ._ , ._ , _ , _
  | refl , refl , a
  | refl , refl , b
  = refl , refl , times-e a b
soundness {t = t₁ ⊞ t₂} x
  with ⊕-elim {a = codegen t₁ ⊕ codegen t₂} x
... | _ , _ , p , _
  with ⊕-elim {a = codegen t₁} p
... | _ , _ , p₁ , p₂
  with soundness {t = t₁} p₁ | soundness {t = t₂} p₂
soundness {t = t₁ ⊞ t₂} x
  | _ ∷ _ ∷ _ , ._ , _ , plus-e ∷ halt-e
  | ._ ∷ ._ , ._ , _ , _
  | refl , refl , a
  | refl , refl , b
  = refl , refl , plus-e a b
soundness {t = Let t₁ In t₂} x
  with ⊕-elim {a = codegen t₁} x
... | _ ∷ _ , _ , p₁ , push-e ∷ q
  with ⊕-elim {a = codegen t₂} q
... | _ ∷ _ , _ ∷ _ , p₂ , _
  with soundness {t = t₁} p₁ | soundness {t = t₂} p₂
soundness {t = Let t₁ In t₂} x
  | _ ∷ ._ , ._ , _ , push-e ∷ q
  | _ ∷ ._ , ._ ∷ ._ , _ , pop-e ∷ halt-e
  | refl , refl , a
  | refl , refl , b
  = refl , refl , let-e a b
```

Now that we have a verified code generator, as a final flourish we’ll implement a basic compiler frontend^{2} for our language and run it on some basic examples.

We define a surface syntax as follows. In the tradition of all the greatest languages such as BASIC, FORTRAN and COBOL, capital letters are exclusively used, and English words are favoured over symbols because it makes the language readable to non-programmers. I should also acknowledge the definite influence of PHP, Perl and `sh` on the choice of the `$` sigil to precede variable names. The sigil `#` precedes numeric literals as Agda does not allow us to overload them.

```
data Surf : Set where
  LET_BE_IN_ : String → Surf → Surf → Surf
  _PLUS_     : Surf → Surf → Surf
  _TIMES_    : Surf → Surf → Surf
  $_         : String → Surf
  #_         : ℕ → Surf

infixr 4 LET_BE_IN_
infixl 5 _PLUS_
infixl 6 _TIMES_
infix 7 $_
infix 7 #_
```

Unlike our `Term` AST, this surface syntax does not include any scope information, uses strings for variable names, and is more likely to be something that would be produced from a parser. In order to compile this language, we must first translate it into our wellformed-by-construction `Term` type, which necessitates *scope-checking*.

```
open import Data.Maybe
open import Data.Maybe.Categorical
open import Category.Monad
open import Category.Applicative
import Level
open RawMonad (monad {Level.zero})
open import Relation.Nullary
```

```
check : ∀{n} → Vec String n → Surf → Maybe (Term n)
check Γ (LET x BE s IN t) = pure Let_In_ ⊛ check Γ s ⊛ check (x ∷ Γ) t
check Γ (s PLUS t) = pure _⊞_ ⊛ check Γ s ⊛ check Γ t
check Γ (s TIMES t) = pure _⊠_ ⊛ check Γ s ⊛ check Γ t
check Γ (# x) = pure (Lit x)
check Γ ($ x) = pure Var ⊛ find Γ x
  where
    find : ∀{n} → Vec String n → String → Maybe (Fin n)
    find [] s = nothing
    find (x ∷ v) s with s ≟ x
    ... | yes _ = just zero
    ... | no _ = suc <$> find v s
```

Note that this function is the only one in our development that is partial: it can fail if an undeclared variable is used. For this reason, we use the `Applicative` instance for `Maybe` to make the error handling more convenient.

Our compiler function, then, merely composes our checker with our code generator:

```
compiler : Surf → Maybe (SM 0 0 1 0)
compiler s = codegen <$> check [] s
```

Note that we can’t really demonstrate correctness of the scope-checking function, save that if it outputs a `Term` then there are no scope errors in that term, as it is impossible to construct a `Term` with scope errors. One possibility would be to define a semantics for the surface syntax; however, this would necessitate a formalisation of substitution and other such unpleasant things. So, we shall gain assurance for this phase of the compiler by embedding some test cases and checking them automatically at compile time.

If we take a simple example, say:

```
example = LET "x" BE # 4
          IN LET "y" BE # 5
          IN LET "z" BE # 6
          IN $ "x" TIMES $ "y" PLUS $ "z"
```

We expect that this program should correspond to the following program:^{3}

```
result : SM 0 0 1 0
result = num 4
       ∷ push
       ∷ num 5
       ∷ push
       ∷ num 6
       ∷ push
       ∷ pick (i 2)
       ∷ pick (i 1)
       ∷ times
       ∷ pick (i 0)
       ∷ plus
       ∷ pop
       ∷ pop
       ∷ pop
       ∷ halt
```

We can embed this test case as a type by constructing an equality value — that way, the test will be re-run every time it is type-checked:

```
test-example : compiler example ≡ just result
test-example = refl
```

As this page is only generated when the Agda compiler type checks the code snippets, we know that this test has passed! Hooray!
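The same trick works for failure cases. As a sketch (the name `test-unbound` is hypothetical and not part of the original development), we can check that a program mentioning an unbound variable is rejected by the scope checker, so the compiler returns `nothing`:

```agda
-- Hypothetical extra test (not in the original development):
-- "x" is never bound, so find fails on the empty context and the
-- failure propagates through the Maybe applicative.
test-unbound : compiler ($ "x") ≡ nothing
test-unbound = refl
```

Because the lookup runs against an empty context, this test does not even need to compare strings; it only exercises the failure path of `check`.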

Working in Agda to verify compilers is a very different experience from that of implementing a certifying compiler in Haskell and Isabelle. In general, the *implementation* of a compiler phase and the *justification of its correctness* are much, much closer together in Agda than in my previous approach. This allows us to save a lot of effort by deriving programs from their proofs.

Also, dependent types are sophisticated enough to allow arbitrary invariants to be encoded in the structure of terms, which makes it possible, with clever formalisations, to avoid having to discharge trivial proof obligations repeatedly. This is in stark contrast to traditional theorem provers like Isabelle, where irritating proof obligations are the norm, and heavyweight tactics must be used to discharge them en masse.

My next experiments will be to try and scale this kind of approach up to more realistic languages. I’ll be sure to post again if I find anything interesting.

Bahr, P., 2015. Calculating certified compilers for non-deterministic languages. In R. Hinze & J. Voigtländer, eds. *Mathematics of Program Construction*. Lecture Notes in Computer Science. Springer International Publishing, pp. 159–186.

de Bruijn, N.G., 1972. Lambda Calculus Notation with Nameless Dummies: a Tool for Automatic Formula Manipulation with Application to the Church-Rosser Theorem. *Indagationes Mathematicae (Elsevier)*, 34, pp.381–392.

Hutton, G. and Wright, J., 2004. Compiling exceptions correctly. In *Mathematics of Program Construction*. Springer, pp. 211–227.

McBride, C., 2003. First-Order Unification by Structural Recursion. *Journal of Functional Programming*, 13(6), pp.1061–1075.

Nipkow, T., Paulson, L.C. and Wenzel, M., 2002. *Isabelle/HOL: A Proof Assistant for Higher-Order Logic*, Springer.

O’Connor, L., Keller, G., Murray, T., Klein, G., Rizkallah, C., Chen, Z., Sewell, T., Lim, J., Amani, S., Nagashima, Y. and Hixon, A., 2016. Cogent: Certified compilation for a Functional Systems Language. *Submitted to POPL. Currently under review*.

Walker, D., 2005. Substructural Type Systems. In B. C. Pierce, ed. *Advanced Topics in Types and Programming Languages*. MIT Press.

In existing large-scale verified software artifacts like seL4, however, we still use *C* as our implementation language^{2}, despite the fact that it is positively hellish to reason about [Koenig 1988]. The reasons for this are numerous, but there are two main ones. The first concern, and the less important, is that most purely functional language implementations, and certainly all total ones, depend on a runtime, which would enlarge the trusted computing base and compromise the system’s efficiency. The second and more pressing concern is that low-level systems like microkernels, or even services such as drivers and file system servers, are forced to confront the reality of the von Neumann architecture. Sometimes they might need to manipulate some bytes in memory, or perform potentially unsafe pointer arithmetic. If we are to follow traditional systems orthodoxy, they simply cannot be efficiently expressed at the level of abstraction of, say, Haskell.

This has meant that these systems are forced to choose an implementation language which requires no runtime support and which supports all of these unsafe features. Sadly, traditional “systems” languages such as C, while satisfying these criteria, will always extract their pound of flesh when it comes to verification. The huge cost to verifiability that comes with allowing divergence and unsafe memory access and so on is not just paid where those semantic features are used, but *everywhere in the program*. The majority of the seL4 proofs are concerned with re-establishing refinement invariants between a state-monadic executable specification and the C implementation. This specification is semantically equivalent, more or less, to the C implementation, but the proof is huge. The majority of obligations are about things like pointer validity, heap validity, type safety, loop termination and so on: things that we don’t have to worry about in total, pure languages.

My research project is focused on reducing the cost of verification by replacing the systems implementation language with one that has a straightforward denotational semantics, about which correctness properties can be easily stated. This language is under a number of constraints: It can’t rely on a runtime, and it must have minimal performance overhead compared to a manually written C implementation. Furthermore, the compilation needs to be highly trustworthy.

The language Cogent, which we will submit to POPL this year, is essentially a linear lambda calculus, with a bunch of usability features, such as pattern matching, records and a simple C FFI. The use of linear types eliminates the need for a garbage collector, and allows for efficient implementation of a purely functional semantics using destructive updates [Wadler 1990]. Indeed, two semantics are ascribed to the language: One which resembles any pure functional language (a denotation in terms of sets and total functions), and one which is of a much more imperative flavour (with a state monadic denotation). We have proven that the imperative semantics is a refinement of the pure semantics for any well-typed program, and the compiler *co-generates* an efficient C implementation and a proof, in Isabelle/HOL [Nipkow et al. 2002], that this imperative semantics is an abstraction of the C code it generates.

To verify properties about programs written in this language, it suffices to use the simple equational reasoning tactics used for any HOL program. Hoare logic and stateful reasoning are gone, and high-level properties are generally stated as equations between terms. The snag, however, is in the C FFI. As the language is so restrictive (total, safe, and with no indirectly observable mutation), the C FFI is used heavily to provide everything from red-black trees to loop iterators. While Cogent lets us reason nicely about code written in it, the moment a C function is used it produces a number of thorny proof obligations, essentially requiring us to show that, at least to an outside observer, the C code has not violated the invariants assumed by Cogent’s type system.

We were able to express the vast majority of an Ext2 file system implementation in Cogent, and the verification of file systems written in Cogent is certainly easier than a raw C implementation. However, there are a number of places in the file system implementation where efficiency is sacrificed in order to be able to express the file system in a safe way. For example, deserialising structures from a disk buffer into file system structures is done byte-by-byte, rather than by a memory map.

The flaw in Cogent’s approach is that it’s all-or-nothing. If a program can’t be expressed in this highly restrictive language, it must be expressed in C, and then all of Cogent’s verification advantages are lost.

To remedy this, I first designed a *spectrum* of languages, each differing primarily in their treatment of memory.

The highest-level language is more or less Cogent as it exists now. Linear types allow for the extremely simple semantics that make verification much easier. The middle language is less restrictive, doing away with linear types and bringing explicit references to the language. This introduces manual memory management and stateful reasoning; however, the memory model remains fairly high level. It is possible to leak memory or invalidate pointers in the middle language, unlike in the highest-level one, but the lack of linear types now permits programs that rely on sharing mutable pointers, such as efficient data structure implementations. The lowest-level language is also stateful, but the state consists of a heap of bytes. Here, pointer arithmetic is permitted, as well as arbitrary casting and reinterpretation of memory, more or less on the same semantic level as C.

Clearly, the compiler for the highest-level language can simply compile to the middle language, and the middle language to the lowest-level one. Once in the lowest-level language, it is straightforward to emit implementation code in LLVM IR or C. The advantage of this spectrum is that we can allow the programmer access to *every* level of abstraction. I plan to achieve this by allowing code from the lowest-level language to be written *inline* inside the middle language, and both of these in turn inside the highest-level language, in the vein of the inline assembly language supported by most C compilers. The crucial point here is that each of these inline blocks will generate a *proof obligation* to ensure that, externally, the inline block is “well-behaved” with respect to the abstractions of the higher-level language. For example, embedding the lowest-level language inside the middle language generates the obligation that any valid pointers left by the program point to data of the right type. Embedding the middle language inside the highest-level language requires showing that all available pointers are valid, no memory was leaked, and that there is at most one pointer to each object remaining. I am exploring the possibility of making each of these languages embedded inside a dependently-typed programming language such as Agda or Idris, to allow the proofs to be expressed along with the program.

These different languages are all total. Trying to fit recursion or loops into our spectrum leads to some unfortunate consequences. Putting them on the “Liberty” side would mean throwing away our nice memory model whenever we want to write a loop, and putting them to the left of the highest-level language on the “Safety” side would mean that every time we step down to a lower level of memory abstraction we would also be obliged to prove termination. So, rather than a *spectrum*, we have a *lattice* of languages, with non-total languages running parallel to the total ones:

The compiler moves from the safest language towards the most permissive one, and to embed a language from the upper-left corner inside a language towards the lower-right corner requires a proof.

I have yet to completely clarify exactly what these languages would look like. It’s my hope that they can share a large amount of syntax, to avoid confusing programmers. Once I work out the details, I suspect this approach will allow programmers to implement systems in a reasonably high-level way, breaking some language-based abstractions when necessary, where the only costs to verifiability come directly from the points where the abstraction is broken.

Koenig, A., 1988. *C traps and pitfalls*, Pearson Education India.

Nipkow, T., Paulson, L.C. and Wenzel, M., 2002. *Isabelle/HOL: A Proof Assistant for Higher-Order Logic*, Springer.

Turner, D., 2004. Total functional programming. *Journal of Universal Computer Science*, 10, pp.187–209.

Wadler, P., 1990. Linear types can change the world! In M. Broy & C. Jones, eds. *IFIP Tc 2 Working Conference on Programming Concepts and Methods, Sea of Galilee, Israel*. North Holland Publishing, pp. 347–359.

On the 23rd of April 2014, the first ever live theorem-proving competition was held at FP-SYD, with predictably great results.

The six contestants, fueled by nothing but alcoholic beverages and armed only with their favorite Coq environment and a handful of basic tactics, set out to prove a set of challenging lemmas in elimination rounds of five minutes or less.

Commentary/sledging was provided helpfully by our commentary team, including Boey Maun Suang, Thomas Sewell and Erik de Castro Lopo (as well as our contestants on the sidelines).

The final rankings were:

- Amos Robinson
- Liam O’Connor (me!)
- Tony Sloane
- Dom De Re and Eric Willigers (tied)
- Ben Lippmeier

Amos took home the chicken trophy, and a good time was had by all. It’s a remarkably fun event, both for those watching (many of whom were not familiar with Coq or theorem proving), and for those participating (I found it quite exhilarating!).

Many thanks go to Ben Lippmeier who organised basically everything and calibrated each lemma for difficulty, and to Atlassian for hosting FP-SYD.
