I have had considerable success at ICFP 2016 and its co-located events. I will be presenting two papers: one at ICFP, on the Cogent project; and one at TyDe, about my work on proof-automation combinators in Agda. I also co-authored a two-page paper that was accepted to the ML workshop, also co-located with ICFP, although I will not be presenting it. Details of all three papers are presented below.

Liam O’Connor, Zilin Chen, Christine Rizkallah, Sidney Amani, Japheth Lim, Toby Murray, Yutaka Nagashima, Thomas Sewell, Gerwin Klein

**Refinement through Restraint: Bringing Down the Cost of Verification**

*International Conference on Functional Programming (ICFP)*, Nara, Japan, September 2016.

Available from CSIRO

We present a framework aimed at significantly reducing the cost of verifying certain classes of systems software, such as file systems. Our framework allows for equational reasoning about systems code written in our new language, Cogent. Cogent is a restricted, polymorphic, higher-order, and purely functional language with linear types and without the need for a trusted runtime or garbage collector. Linear types allow us to assign two semantics to the language: one imperative, suitable for efficient C code generation; and one functional, suitable for equational reasoning and verification. As Cogent is a restricted language, it is designed to easily interoperate with existing C functions and to connect to existing C verification frameworks. Our framework is based on certifying compilation: For a well-typed Cogent program, our compiler produces C code, a high-level shallow embedding of its semantics in Isabelle/HOL, and a proof that the C code correctly refines this embedding. Thus one can reason about the full semantics of real-world systems code productively and equationally, while retaining the interoperability and leanness of C. The compiler certificate is a series of language-level proofs and per-program translation validation phases, combined into one coherent top-level theorem in Isabelle/HOL.

Liam O’Connor

**Applications of Applicative Proof Search**

*Workshop on Type-Driven Development*, Nara, Japan, September 2016.

In this paper, we develop a library of typed proof search procedures, and demonstrate their remarkable utility as a mechanism for proof-search and automation. We describe a framework for describing proof-search procedures in Agda, with a library of tactical combinators based on applicative functors. This framework is very general, so we demonstrate the approach with two common applications from the field of software verification: a library for property-based testing in the style of SmallCheck, and the embedding of a basic model checker inside our framework, which we use to verify the correctness of common concurrency algorithms.

Yutaka Nagashima and Liam O’Connor

**Close Encounters of the Higher Kind: Emulating Constructor Classes in Standard ML**

*Workshop on ML 2016*, Nara, Japan, September 2016.

Available from CSIRO

We implement a library for encoding constructor classes in Standard ML, including elaboration from minimal definitions, and automatic instantiation of superclasses.

The language I have been working on for the last couple of years with NICTA/Data61, called Cogent, was recently accepted to ICFP 2016, topping off a string of publications at ITP 2016 and ASPLOS 2016.

The ICFP paper, entitled “Refinement Through Restraint: Bringing Down the Cost of Verification”, outlines our approach to the language design, the implementation of its certifying compiler, and the various refinement proofs that it generates.

The paper will be available from NICTA after publication.

We present a framework aimed at significantly reducing the cost of verifying certain classes of systems software, such as file systems. Our framework allows for equational reasoning about systems code written in our new language, Cogent. Cogent is a restricted, polymorphic, higher-order, and purely functional language with linear types and without the need for a trusted runtime or garbage collector. Linear types allow us to assign two semantics to the language: one imperative, suitable for efficient C code generation; and one functional, suitable for equational reasoning and verification. As Cogent is a restricted language, it is designed to easily interoperate with existing C functions and to connect to existing C verification frameworks. Our framework is based on certifying compilation: For a well-typed Cogent program, our compiler produces C code, a high-level shallow embedding of its semantics in Isabelle/HOL, and a proof that the C code correctly refines this embedding. Thus one can reason about the full semantics of real-world systems code productively and equationally, while retaining the interoperability and leanness of C. The compiler certificate is a series of language-level proofs and per-program translation validation phases, combined into one coherent top-level theorem in Isabelle/HOL.

We were also successful at ITP 2016 with our paper “A Framework for the Automatic Formal Verification of Refinement from Cogent to C”, in which we present the details of our low-level translation validation refinement framework. This paper is also available from NICTA.

Our language Cogent simplifies verification of systems software using a certifying compiler, which produces a proof that the generated C code is a refinement of the original Cogent program. Despite the fact that Cogent itself contains a number of refinement layers, the semantic gap between even the lowest level of Cogent semantics and the generated C code remains large. In this paper we close this gap with an automated refinement framework which validates the compiler’s code generation phase. This framework makes use of existing C verification tools and introduces a new technique to relate the type systems of Cogent and C.

Lastly, Gernot Heiser recently presented (on behalf of Sidney Amani) our systems paper at ASPLOS 2016, where we outline the implementation, performance, and verification of two case study file systems using Cogent. This paper is available from NICTA here.

We present an approach to writing and formally verifying high-assurance file-system code in a restricted language called Cogent, supported by a certifying compiler that produces C code, a high-level specification of the Cogent code, and translation correctness proofs. The language is strongly typed and guarantees absence of a number of common file system implementation errors. We show how verification effort is drastically reduced for proving higher-level properties of the file system implementation by reasoning about the generated formal specification rather than its low-level C code. We use the framework to write two Linux file systems, and compare their performance with their native C implementations.

This concludes a string of publications for our project in the three main fields it intersects: programming languages (ICFP), systems (ASPLOS) and formal methods (ITP). Congratulations are due to everyone who contributed.

Work hard, boy, and you’ll find,

One day you’ll have

A blog like mine, blog like mine,

A blog like mine.

— Cat Stevens [Paraphrased]

Over the years many people have asked me to share with them how I developed this website, seeing as it has a variety of very nice features including:

- Embedded LaTeX math, actually rendered with LaTeX and embedded in the document. This means I can use LaTeX packages like `tikz` and `xypic` for diagrams, as well as typeset formulas inline with other text, with baseline correction.
- Literate Agda support, such as in this article, with semantic highlighting and hyperlinked jump-to-definition, even for the standard library imports.
- Citeproc-based bibliography support, which reads BibTeX databases. Most of my technical articles make use of this, for example this article, and the article linked above.
- Atom feed support (see the little button at the bottom of this page).
- Syntax highlighting for most other languages, such as Haskell, visible in the same article linked above.

Up until recently, I was very reluctant to share the code for this website because, among other things:

- LaTeX rendering was accomplished through the use of an ancient, barely-working piece of C code I cribbed from the GladTeX installation. It was very system-dependent: it required `latex`, `dvips` and `gs` to be in exactly the right places to work correctly. The formulae were then embedded in the HTML document by **postprocessing** the output of Pandoc with `tagsoup`, detecting any equations in a very ad-hoc way. They were cached using a liberal mix of `unsafePerformIO` and hacks, which barely worked and typically caused problems if the LaTeX failed to compile.
- The bibliography support relied on a hacky workaround I had written to provide `Binary` instances for an unexposed part of `pandoc-citeproc`.
- The site would mysteriously corrupt its own Hakyll cache for no reason.
- The Literate Agda compiler was hardcoded against an old version of Agda and was pretty unmaintainable.

However, I am pleased to announce that I have now cleaned up all of these problems. The LaTeX rendering has been completely rewritten in Haskell, and is more configurable and useful. It’s provided in my `latex-formulae` suite of packages, available on GitHub and Hackage under the following three packages:

- `latex-formulae-image` - Basic support for rendering LaTeX formulae to images using actual system LaTeX.
- `latex-formulae-pandoc` - Support for rendering LaTeX equations inside Pandoc documents, and a standalone executable that can be used as a Pandoc filter.
- `latex-formulae-hakyll` - A Hakyll compiler that uses `latex-formulae-pandoc`, and adds some useful caching for Hakyll `watch` servers.

The Agda code has been cleaned up and packaged into two useful Hackage packages, also available on GitHub:

- `agda-snippets` - Support for preprocessing *any text* document, replacing Agda code blocks with coloured, hyperlinked source in HTML. It leaves the rest of the text untouched, so it’s possible to then read the output with Pandoc.
- `agda-snippets-hakyll` - Adds some basic mechanisms to integrate `agda-snippets` with Hakyll and Pandoc.

The bibliography workaround is no longer necessary, due to fixes in upstream libraries.

The other features (feeds, highlighting) are all just provided by Hakyll built-in. The source of this website itself is also available on GitHub, in particular the Hakyll code for it may be of interest.

Recently I released the Haskell library patches-vector, a library for simple but theoretically-sound manipulation of *patches*, or diffs, on a “document”, which in this case just consists of a `Vector` of arbitrary elements.

I approached the development of this library from a formal perspective, devising laws for all the operations and rigorously checking them with QuickCheck. I defined a patch as a series of *edits* to a document, where an *edit* is simply an insertion, deletion, or replacement of a particular vector element.

```
newtype Patch a = Patch [Edit a] deriving (Eq)

data Edit a = Insert  Int a
            | Delete  Int a
            | Replace Int a a
            deriving (Show, Read, Eq)
```

We have a function, `apply`, that takes a patch and applies it to a document:

`apply :: Patch a -> Vector a -> Vector a`
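To make the intended semantics concrete, here is a hypothetical list-based sketch of `apply` (the real library operates on `Vector`s and is more efficient). It assumes the edits in a patch are sorted by index, and that indices refer to positions in the *original* document:

```haskell
-- A hypothetical sketch, not the library's implementation.
newtype Patch a = Patch [Edit a] deriving (Show, Eq)

data Edit a = Insert Int a
            | Delete Int a
            | Replace Int a a
            deriving (Show, Eq)

applyList :: Patch a -> [a] -> [a]
applyList (Patch edits) doc = go 0 edits doc
  where
    -- i is the original-document index of the head of ds
    go _ [] ds = ds
    go i (Insert j x : rest) ds
      | i == j = x : go i rest ds            -- insert before original position j
    go i (Delete j _ : rest) (_ : ds)
      | i == j = go (i + 1) rest ds          -- drop the element at j
    go i (Replace j _ x : rest) (_ : ds)
      | i == j = x : go (i + 1) rest ds      -- substitute at j
    go i es (d : ds) = d : go (i + 1) es ds  -- keep untouched elements
    go _ _ [] = error "edit index out of range"
```

For example, `applyList (Patch [Insert 1 'b']) "ac"` yields `"abc"`.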

Patches may be structurally different but accomplish the same thing. For example, a patch that consists of a `Delete` and an `Insert` may extensionally be equivalent to a patch that does a single `Replace`, but they are structurally different. To simplify the mathematical presentation here, we define an equivalence relation that captures this *extensional* patch-equivalence: two patches are equivalent if applying them to any document produces the same result.

To define further operations, we must first note that patches and documents form a *category*. A *category* is made up of a class of *objects* (in this case documents), and a class of arrows or *morphisms* (in this case patches). For each object $A$, there must be an *identity morphism* $\mathit{id}_A : A \to A$, and for each pair of morphisms $f : A \to B$ and $g : B \to C$ there must be a composed morphism $g \circ f : A \to C$. They must satisfy the following laws:

- Left-identity: for any morphism $f : A \to B$, $\mathit{id}_B \circ f = f$

- Right-identity: for any morphism $f : A \to B$, $f \circ \mathit{id}_A = f$

- Associativity: for any three morphisms $f$, $g$, $h$, $h \circ (g \circ f) = (h \circ g) \circ f$

The category laws comprise the first part of our specification. Translating it into Haskell, I made `Patch a` an instance of `Monoid`, just for convenience, even though the composition operator is not defined for arbitrary pairs of patches, and therefore patches technically are not a monoid in the algebraic sense.

Then, the above laws become the following QuickCheck properties:

```
forAll (patchesFrom d) $ \a -> a <> mempty == a
forAll (patchesFrom d) $ \a -> mempty <> a == a
forAll (historyFrom d 3) $ \[a, b, c] ->
  apply (a <> (b <> c)) d == apply ((a <> b) <> c) d
```

Here, `patchesFrom d` is a generator of patches with domain document `d`, and `historyFrom d 3` produces a sequence of three patches, applied one after the other, starting at `d`.

In the case of patches and documents, they form what’s called an *indiscrete category* or a *chaotic category*, as there exists a single, unique^{1} patch between any two documents. A function to *find* that patch is simply `diff`, which takes two documents and computes the patch between them, using the Wagner-Fischer algorithm [Wagner and Fischer 1974].

`diff :: Eq a => Vector a -> Vector a -> Patch a`

It’s easy to come up with correctness properties for such a function, just by examining its interaction with the identity patch, the composition operator, and `apply`:

```
apply (diff d e) d == e
diff d d == mempty
apply (diff a b <> diff b c) a == apply (diff a c) a
```
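As an illustration of this specification (and not the library's actual Wagner-Fischer implementation), a naive, exponential-time `diff` over lists can be written directly from the recursive structure of edit distance. The `Edit` and `Patch` types here mirror the listing above, with indices referring to the original document:

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

-- Hypothetical types mirroring the listing above.
newtype Patch a = Patch [Edit a] deriving (Show, Eq)

data Edit a = Insert Int a
            | Delete Int a
            | Replace Int a a
            deriving (Show, Eq)

-- Naive minimum-length edit script; the library uses Wagner-Fischer instead.
diffList :: Eq a => [a] -> [a] -> Patch a
diffList old new = Patch (go 0 old new)
  where
    go i []       ys = [Insert i y | y <- ys]        -- only insertions remain
    go i xs       [] = zipWith Delete [i ..] xs      -- only deletions remain
    go i (x : xs) (y : ys)
      | x == y    = go (i + 1) xs ys                 -- matching elements cost nothing
      | otherwise = minimumBy (comparing length)
          [ Replace i x y : go (i + 1) xs ys
          , Delete  i x   : go (i + 1) xs (y : ys)
          , Insert  i y   : go i (x : xs) ys
          ]
```

So `diffList "ac" "abc"` produces `Patch [Insert 1 'b']`, and `diffList d d` is always the empty patch.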

As there exists a patch between any two documents, it follows that for every patch $p$ there exists an *inverse patch* $p^{-1}$ such that $p \circ p^{-1} = \mathit{id}$ and $p^{-1} \circ p = \mathit{id}$. We define a function, `inverse`, in Haskell:

`inverse :: Patch a -> Patch a`

And we can check all the usual properties of inverses:

```
forAll (patchesFrom d) $ \p -> p <> inverse p == mempty
forAll (patchesFrom d) $ \p -> inverse p <> p == mempty
forAll (patchesFrom d) $ \p -> inverse (inverse p) == p
forAll (patchesFrom d) $ \p -> inverse mempty == mempty
forAll (historyFrom d 2) $ \[p, q] ->
  inverse (p <> q) == inverse q <> inverse p
```

We can also verify that the inverse patch is the same as the one we could have found by `diff`:

`apply (diff d e) d == apply (inverse (diff e d)) d`

A category that contains inverses for all morphisms is called a *groupoid*. All indiscrete categories (such as patches) are groupoids, as the inverse morphism is guaranteed to exist. Groupoids are very common, and can also be thought of as a group^{2} with a partial composition relation, but I find the categorical presentation cleaner.

So, we have now specified how to compute the unique patch between any two documents (`diff`), how to squash patches together into a single patch (composition), how to apply patches to a document (`apply`), and how to compute the inverse of a given patch (`inverse`). The only thing we’re missing is the crown jewel of patch theory: how to *merge* patches when they diverge.

I came to patch theory from concurrency control research, and not via the patch theory of Darcs [Jacobson 2009], so there are some differences in how I approached this problem compared to how Darcs does.

In their seminal paper on the topic [Ellis and Gibbs 1989], Ellis and Gibbs define a function that, given a diverging pair of patches $p$ and $q$, will produce new patches $p'$ and $q'$, such that the result of applying $p$ followed by $q'$ is the same as the result of applying $q$ followed by $p'$.

They called this approach *operational transformation*, but category theory has a shorter name for it: a *pushout*. A *pushout* of two morphisms $p : A \to B$ and $q : A \to C$ consists of an object $D$ and two morphisms $q' : B \to D$ and $p' : C \to D$ such that $q' \circ p = p' \circ q$. The pushout must also be *universal*, but as our category is indiscrete we know that this is the case without having to do anything.

We can use this pushout, which we call `transform`, as a way to implement merges. Assuming a document history ending in the latest document $d_n$, and an incoming patch $p$ made against some earlier version $d_i$, we can simply `transform` the incoming patch against the composition of all the patches from $d_i$ to $d_n$, resulting in a new patch that can be applied to the latest document $d_n$.

Note that just specifying the `transform` function to be a pushout isn’t quite sufficient: it would be perfectly possible to resolve two diverging patches $p$ and $q$ by simply undoing both, using $q^{-1}$ for $p'$ and $p^{-1}$ for $q'$. Both sides would then resolve to the same document, but it probably wouldn’t be what the user intended.

Instead, our `transform` function will attempt to incorporate the changes of each patch into the transformed version of the other, up to merge conflicts, which can be handled by a function passed in as a parameter to `transform`:

`transformWith :: (a -> a -> a) -> Patch a -> Patch a -> (Patch a, Patch a)`

Then we can add the pushout property as part of our QuickCheck specification:

```
forAll (divergingPatchesFrom d) $ \(p, q) ->
  let (p', q') = transformWith const p q
  in  apply (p <> q') d == apply (q <> p') d
```
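To make the pushout equation concrete, here is a hypothetical (and deliberately degenerate) construction: compute both divergent results, pick a merged document with a caller-supplied `resolve` function, and diff each side against it. Unlike the library's `transformWith`, this sketch needs the base document and ignores the structure of the edits entirely; it exists only to show that the pushout equation is satisfiable:

```haskell
-- Hypothetical types and helpers, mirroring the earlier listing.
newtype Patch a = Patch [Edit a] deriving (Show, Eq)

data Edit a = Insert Int a | Delete Int a | Replace Int a a
  deriving (Show, Eq)

-- Apply a patch whose edits are sorted by original-document index.
applyList :: Patch a -> [a] -> [a]
applyList (Patch edits) doc = go 0 edits doc
  where
    go _ [] ds = ds
    go i (Insert j x : rest) ds | i == j = x : go i rest ds
    go i (Delete j _ : rest) (_ : ds) | i == j = go (i + 1) rest ds
    go i (Replace j _ x : rest) (_ : ds) | i == j = x : go (i + 1) rest ds
    go i es (d : ds) = d : go (i + 1) es ds
    go _ _ [] = error "edit index out of range"

-- A trivially correct (far from minimal) diff: delete everything, then
-- insert everything. Satisfies: applyList (diffTrivial d e) d == e.
diffTrivial :: [a] -> [a] -> Patch a
diffTrivial d e = Patch (zipWith Delete [0 ..] d ++ map (Insert (length d)) e)

-- Degenerate pushout: both transformed patches target the resolved document.
transformVia :: ([a] -> [a] -> [a]) -> [a] -> Patch a -> Patch a -> (Patch a, Patch a)
transformVia resolve d p q = (diffTrivial e2 m, diffTrivial e1 m)
  where
    e1 = applyList p d  -- result of taking p's branch
    e2 = applyList q d  -- result of taking q's branch
    m  = resolve e1 e2  -- the merged document both sides converge on
```

By construction, for `(p', q') = transformVia resolve d p q` we have `applyList q' (applyList p d) == applyList p' (applyList q d)`, which is exactly the pushout equation.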

If the merge handler is commutative, then so is `transformWith`:

```
forAll (divergingPatchesFrom d) $ \(p, q) ->
  let (p' , q' ) = transformWith (*) p q
      (q'', p'') = transformWith (*) q p
  in  p' == p'' && q' == q''
```

We can also ensure that `transformWith` keeps the intention of the input patches by using the identity patch `mempty` as one of the diverging patches:

```
forAll (patchesFrom d) $ \p ->
  transformWith (*) mempty p == (mempty, p)
forAll (patchesFrom d) $ \p ->
  transformWith (*) p mempty == (p, mempty)
```

And with that, we’ve specified `patches-vector`. A patch theory is “just” a small, indiscrete groupoid with pushouts^{3}. We can theoretically account for all the usual patch operations: inversion, composition, merging, `diff`, and `apply`, and this gives rise to a spec that is rock solid and machine-checked by QuickCheck.

The full code is available on GitHub and Hackage. Please do try it out!

I also wrote a library, `composition-tree` (also on Hackage and GitHub), which is similarly thoroughly specified, and is a convenient way to store a series of patches in a sequence, with good asymptotics for things like taking the `mconcat` of a sublist. I use these two libraries together with `pandoc`, `acid-state` and `servant` to make a basic wiki system with excellent support for concurrent edits, and for edits to arbitrary versions. The wiki system is called `dixi` (also on GitHub and Hackage).

I independently invented this particular flavour of patch theory, but it’s extremely similar to, for example, the patch theory underlying the pijul version control system [see Mimram and Di Giusto 2013], which also uses pushouts to model merges.

Another paper that is of interest is the recent work encoding patch theory inside Homotopy Type Theory using Higher Inductive Types [Angiuli et al. 2014]. HoTT is typically given semantics by ∞-groupoids, so it makes sense that patches would have a natural encoding, but I haven’t read that paper yet.

Also, another paper [Swierstra and Löh 2014] uses separation logic to describe the semantics of version control, which is another interesting take on patch theoretic concepts.

Angiuli, C., Morehouse, E., Licata, D.R. and Harper, R., 2014. Homotopical patch theory. In *Proceedings of the 19th ACM SIGPLAN International Conference on Functional Programming*. ICFP ’14. New York, NY, USA: ACM, pp. 243–256.

Ellis, C.A. and Gibbs, S.J., 1989. Concurrency control in groupware systems. In *Proceedings of the 1989 ACM SIGMOD International Conference on Management of Data*. SIGMOD ’89. New York, NY, USA: ACM, pp. 399–407.

Jacobson, J., 2009. *A formalization of darcs patch theory using inverse semigroups*, UCLA Computational and Applied Mathematics.

Mimram, S. and Di Giusto, C., 2013. A categorical theory of patches. *Electronic Notes in Theoretical Computer Science*, 298, pp.283–307.

Swierstra, W. and Löh, A., 2014. The semantics of version control. In *Proceedings of the 2014 ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming & Software*. Onward! 2014. New York, NY, USA: ACM, pp. 43–54.

Wagner, R.A. and Fischer, M.J., 1974. The string-to-string correction problem. *J. ACM*, 21(1), pp.168–173.

The main direction of my research thus far has been to exploit linear types to make purely functional, equational reasoning possible for systems programming. I’ve talked about the difficulty of verifying imperative programs before, but my past entries about linear types did not discuss the advantages that they can bring to such a domain.

Chiefly, they allow us to reason about our program as if all data structures are immutable, with all of the benefits that implies, while the actual implementation performs efficient destructive updates to mutable data structures. This is achieved simply by statically ruling out every program where the difference between the immutable and the mutable interpretations can be observed, by requiring that every variable that refers, directly or indirectly, to a heap data structure, must be used exactly once. As variables cannot be used multiple times, this implies that for any allocated heap object, there is exactly one live, usable reference that exists at any one time. This is called the *uniqueness* property of linear types.

This is a very simple restriction, but it proves a considerable burden when trying to actually write programs. For example, a naïve definition of a linear array would become unusable after just one element was accessed! Other data structures, with complex reference layouts that involve multiple aliasing references and sharing, simply cannot be expressed.

For this reason, when designing the linear systems language Cogent for my research, I allowed parts of the program to be written in unsafe, imperative C, and those C snippets are able to manipulate opaque types that are *abstract* in the purely functional portion. The author of the code would then have to prove that the C code doesn’t do anything too unsafe that would violate the invariants of the linear type system.

Specifically, Cogent extends the (dynamic) typing relation for values to include the *set of locations* which can be accessed from a value^{1}. For example, the typing rule for tuple values requires the pointer sets of the components to be disjoint:

Observe how we have used these pointer sets to enforce that there is no internal aliasing in the structure. They also give us the information necessary to precisely state the conditions under which a C program is safe to execute. We define *stores*, denoted $\mu$, as partial mappings from a location (or pointer) $\ell$ to a value $v$.

Assuming a C-implemented function is evaluated with an input value $v$ and an input store $\mu$, the return value $v'$ and output store $\mu'$ must satisfy the following three properties for all locations $\ell$:

- **Leak freedom**: any input reference that wasn’t returned was freed.
- **Fresh allocation**: every new output reference, not in the input, was allocated in previously-free space.
- **Inertia**: every reference not in either the input or the output of the function has not been touched in any way.

Assuming these three things, it’s possible to show that the two semantic interpretations of linearly-typed programs are equivalent, even if they depend on unsafe, imperative C code. I called these three conditions together the *frame conditions*, named after the *frame problem* from the field of knowledge representation. The frame problem is a common issue that comes up in many formalisations of stateful processes. Specifically, it refers to the difficulty of *local reasoning* in many of these formalisations. The state or store is typically represented (as in our Cogent formalisation above) as a large, monolithic blob. Therefore, whenever any part of the state is updated, every invariant about the state must be re-established, even if it has nothing to do with the part of the state that was updated. The above conditions allow us to state that the C program does not affect any part of the state except those parts it is permitted (by virtue of the linear references it received) to modify, thus allowing us to enforce the type system invariants across the whole program.
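The three frame conditions can be modelled concretely in Haskell (the names and types here are mine, not Cogent's): represent a store as a finite map from locations to values, and the pointer sets of the input and output values as sets of locations:

```haskell
import qualified Data.Map as M
import qualified Data.Set as S

type Loc     = Int
type Store v = M.Map Loc v

-- Check the frame conditions for a call with input pointer set `ins`
-- (locations reachable from the input value), output pointer set `outs`
-- (reachable from the result), input store `mu` and output store `mu'`.
frameConditions :: Eq v => S.Set Loc -> S.Set Loc -> Store v -> Store v -> Bool
frameConditions ins outs mu mu' = all ok locs
  where
    locs = S.toList (S.unions [ins, outs, M.keysSet mu, M.keysSet mu'])
    ok l
      | l `S.member` ins && not (l `S.member` outs) =
          not (M.member l mu')             -- leak freedom: dropped inputs are freed
      | l `S.member` outs && not (l `S.member` ins) =
          not (M.member l mu)              -- fresh allocation: new outputs were free before
      | not (l `S.member` ins) && not (l `S.member` outs) =
          M.lookup l mu == M.lookup l mu'  -- inertia: everything else is untouched
      | otherwise = True                   -- passed straight through: no constraint here
```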

Presenting such proof obligations in terms of stores and references as described above, however, is extremely tedious and difficult to work with when formally reasoning about imperative programs, particularly if the invariants we are trying to show are initially broken and only later re-established. Typically, imperative programs lend themselves to axiomatic semantics for verification, the most obvious example being Hoare Logic [Hoare 1969], which provides a proof calculus for a judgement written $\{P\}\ C\ \{Q\}$, stating that if the initial state $\sigma$ (which maps variables to values) satisfies an assertion $P$, then the resultant state of running $C$ on $\sigma$ satisfies $Q$.

When our assertions involve references and aliasing, however, Hoare Logic doesn’t buy us much over just reasoning about the operational semantics directly. A variety of ad-hoc operators have to be added to the logic, for example to say that references do not alias, that references point to free space, or that references point to valid values. To make this cleaner, we turn instead to *Separation Logic* [Reynolds 2002]. Separation Logic is a variant of Hoare Logic that is specifically designed to accommodate programming with references and aliasing. It augments the state of Hoare Logic with a mutable store $\mu$, and the following additional assertions:

- A special assertion $\mathsf{emp}$, which states that the store is empty, i.e. $\mu \vDash \mathsf{emp}$ if and only if $\mathrm{dom}(\mu) = \emptyset$.
- A binary operator $e \mapsto e'$, which states that the store is defined at *exactly one* location, i.e. $\mu \vDash e \mapsto e'$ if and only if $\mathrm{dom}(\mu) = \{e\}$ and $\mu(e) = e'$.
- A *separating conjunction* connective $P * Q$, which says that the store can be split into two disjoint parts $\mu_1$ and $\mu_2$, where $\mu_1 \vDash P$ and $\mu_2 \vDash Q$.
- A *separating implication* connective $P \mathrel{-\!\!*} Q$, which says that extending the store with a disjoint part that satisfies $P$ results in a store that satisfies $Q$.
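These assertions have a direct shallow embedding as predicates over finite stores. The following hypothetical Haskell sketch (my own names; finite stores only, and omitting separating implication, which quantifies over all store extensions) interprets the separating conjunction by enumerating the disjoint splits of a store:

```haskell
import qualified Data.Map as M

type Loc         = Int
type Store v     = M.Map Loc v
type Assertion v = Store v -> Bool

-- emp: the store is empty.
emp :: Assertion v
emp = M.null

-- pointsTo l v: the store is defined at exactly the one location l.
pointsTo :: Eq v => Loc -> v -> Assertion v
pointsTo l v mu = mu == M.singleton l v

-- Separating conjunction: some disjoint split of the store satisfies
-- p on one half and q on the other. Exponential, but fine for tiny stores.
sep :: Assertion v -> Assertion v -> Assertion v
sep p q mu = or [ p m1 && q m2 | (m1, m2) <- splits mu ]
  where
    -- every way of dividing the bindings into two disjoint stores
    splits = foldr extend [(M.empty, M.empty)] . M.toList
    extend (k, v) acc =
      [ s | (m1, m2) <- acc
          , s <- [(M.insert k v m1, m2), (m1, M.insert k v m2)] ]
```

In this model, a store holding exactly two cells satisfies the separating conjunction of the two corresponding points-to assertions, but a single cell can never satisfy a separating conjunction of itself with itself, just as the logic dictates.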

Crucially, Separation Logic includes the *frame rule*, its own solution to the frame problem, where an unrelated assertion $R$ can be added to both the pre- and the post-condition of a given program in a separating conjunction (provided the program does not modify any variables free in $R$):

$$\frac{\{P\}\ C\ \{Q\}}{\{P * R\}\ C\ \{Q * R\}}$$

This allows much the same local reasoning that we desired before: the program $C$ can be verified to work for a store that satisfies $P$, but otherwise contains *no other values*. Then that program may be freely used with a *larger* store, and we automatically learn, from the frame rule, that any unrelated bit of state cannot affect, and is not affected by, the program $C$.

Separation logic makes expressing these obligations substantially simpler. For example, given a program with input pointers and output pointers, we can express all three frame conditions as a single triple:

Here is a sketch of a proof that this implies the frame conditions listed above. Assume an input store . Split into disjoint stores and such that . Let the output store of running with be . Note that by the triple above, we have that .

We have by the frame rule that the output of running with the full store is where .

- **Leak freedom**: For any arbitrary location , if but then we must show that . As , we know from that and, as they are disjoint, . Therefore, the only way for to be true is if , but as from , we can conclude that .
- **Fresh allocation**: If but then we must show that . We have from that , and hence . As they are disjoint, so the only way for to be true is if . But, as we know that from and , we can conclude that .
- **Inertia**: If and , then we can conclude from that and from that . If , then , thanks to the frame rule as shown above. If , then and and therefore we can say that as they’re both undefined.

I think this is a much cleaner and easier way to state the frame conditions.

My next item to investigate is how I might integrate this into a seamless language and verification framework. My current thinking is to take a lambda calculus with linear types and refinement types, and augment it with an imperative embedded language, which allows several of the guarantees of the linear type system to be suspended. The imperative embedded language might resemble the Hoare-state monad [Swierstra 2009], only using Separation Logic rather than Hoare Logic, but I am still figuring out all the details.

Hoare, C.A.R., 1969. An axiomatic basis for computer programming. *Communications of the ACM*, 12(10), pp.576–580.

Reynolds, J.C., 2002. Separation logic: A logic for shared mutable data structures. In *Proceedings of the 17th Annual IEEE Symposium on Logic in Computer Science*. LICS ’02. Washington, DC, USA: IEEE Computer Society, pp. 55–74.

Swierstra, W., 2009. A hoare logic for the state monad. In *Theorem Proving in Higher Order Logics*. Lecture Notes in Computer Science. Springer Berlin Heidelberg, pp. 440–451.

The real formalisation is a bit more complicated, allowing nonlinear *read-only* pointers as well as linear, writable ones.↩

```
{-# OPTIONS --type-in-type #-}
module 2015-09-10-girards-paradox where
open import Data.Empty
open import Data.Unit
open import Data.Bool
open import Relation.Binary.PropositionalEquality
open import Data.Product
```

Axiomatic set theories such as that of Zermelo and Fraenkel, in their attempt to provide a comprehensive foundation for mathematics, involve several intricate tricks to avoid becoming inconsistent. A suitably naïve set theory is already inconsistent due to the infamous paradoxical set of Russell [1938].

Here we have used *set comprehension* to define $R = \{x \mid x \notin x\}$, the set of all sets that do not contain themselves. This leads to the question: does $R$ contain $R$? If $R$ is an element of $R$, then it is not, as $R$ only contains sets that do not contain themselves. If $R$ is not an element of $R$, then it is, as $R$ does not contain itself. We have a paradox!

To address this, different foundations take different approaches. Most axiomatic set theories eliminate or restrict the *rule of comprehension*, that is, they don’t allow sets to be constructed from arbitrary predicates. Instead, set comprehension can only be used to describe subsets of already constructed sets. This prevents comprehension from being used above, but it also prevents a lot of other useful constructions, like products or unions! Thus a handful of other axioms to construct sets are added, such as pairing, union, powerset and so on (all nicely explained in Halmos [1960]).

Another axiom, that of *regularity*, says^{1} that there is no infinite sequence of sets $x_0, x_1, x_2, \ldots$ such that, for any $i$, $x_{i+1} \in x_i$. This implies that no set can contain itself, and allows us to build the universe of set theory by *stages*, called *ranks*. At rank zero, no sets exist; at rank one, there is just the empty set; at rank two, there is also the set containing the empty set; and at each following rank, the added sets all contain the sets that are defined at earlier ranks, as shown in the following figure:

The entire universe of set theory can be thought of as the union of the universe at each rank, $V = \bigcup_{\alpha} V_{\alpha}$, a presentation originally due to Zermelo [1930], but commonly attributed to John von Neumann.

This stratification bears remarkable similarity to Russell’s theory of *types* (see Russell [1938]), his own solution to the paradoxical set $R$, and the distant ancestor of modern type theory.

Indeed, in the intuitionistic type theory of Martin-Löf [1984], the approximate foundation of the Agda proof assistant, we have a hierarchy of types that very much resembles that of von Neumann or Zermelo^{2}:

$$\mathsf{Set}_0 : \mathsf{Set}_1 : \mathsf{Set}_2 : \cdots$$

The rule of *cumulativity*, which is not present in Agda^{3} but exists in some type theories and languages such as Idris, makes this resemblance even stronger: if $A : \mathsf{Set}_n$, then also $A : \mathsf{Set}_{n+1}$.

This rule implies that, like the von Neumann rank $V_\alpha$, a type $\mathsf{Set}_n$ is inhabited by every type $\mathsf{Set}_m$ where $m < n$.

The differences between the two theories start to emerge when one examines *why* this stratification exists in type theory. In axiomatic set theory, eliminating the axiom of regularity and thus the stratification it implies makes it rather difficult to do induction, but it does not make the theory inconsistent — there have been several *non-well-founded set theories* proposed, such as the hyperset theory of Aczel [1988], which do exactly this.

Removing unrestricted set comprehension is enough to avoid Russell’s paradox, as it allows us to distinguish between *formulae* (or predicates) and *sets*. Unlike informal set theory, we cannot construct a set for any given formula. For example, $x \notin x$ is a valid formula, but $\{x \mid x \notin x\}$ is *not* a set.

Type theories are not set theories — they do not have a separate logical formula language, like that of Frege, to serve as a basis for the theory. So, one cannot achieve consistency in type theory by restricting how a set may be constructed from a logical formula. Instead, type theory places restrictions on the kinds of formulae that can be expressed. Rather than rule out paradoxical *sets* representing self-referential propositions, type theory rules out *the propositions themselves*. In such a theory, it is not even well-formed to ask if a set contains itself^{4}.

This restriction is a consequence of the hierarchy mentioned earlier — remove this from type theory, by saying instead that `Set : Set`, and the result is more or less equivalent to Falso^{5}. We can show that type theory is inconsistent with this change using Girard’s paradox, which is a generalised encoding of Russell’s paradox for pure type systems. The contradiction derived from this paradox is rather involved, so much so that Martin-Löf himself didn’t realise that it applied to the first version of his type theory. Hurkens [1995] provided a simplification, which is encoded in Agda here.

With inductive types, however, we can use Russell’s paradox directly, by formalising a naïve notion of sets as comprehensions, and using this to derive a contradiction.

For these (interactive) Agda snippets, I have enabled `--type-in-type`, which removes the predicative hierarchy from the type theory, instead stating that `Set : Set`.

```
data SET : Set where
set : (X : Set) → (X → SET) → SET
```

This defines a set (written `set X f`) as a comprehension over a *carrier type* `X` and a function `f`, where the element for a given index value `x : X` is given by `f x`. This definition is already using the fact that `Set : Set` — normally, a type `X : Set` would not be permitted as a parameter to `set`, which constructs a type of the same size `Set`.

The empty set, having no elements, uses the empty type `⊥` as its carrier:

```
∅ : SET
∅ = set ⊥ ⊥-elim
```

The set containing the empty set, having one element, uses the unit type as its carrier:

```
⟨∅⟩ : SET
⟨∅⟩ = set ⊤ (λ _ → ∅)
```

The next rank, $\{\emptyset, \{\emptyset\}\}$, has two elements, and thus can use `Bool` as its carrier:

```
⟨∅,⟨∅⟩⟩ : SET
⟨∅,⟨∅⟩⟩ = set Bool (λ x → if x then ∅ else ⟨∅⟩)
```

More sets could be defined using similar techniques, so I will forgo any further definitions.

We can also define the membership operators for our `SET` type:

```
_∈_ : SET → SET → Set
a ∈ set X f = Σ X (λ x → a ≡ f x)
_∉_ : SET → SET → Set
a ∉ b = (a ∈ b) → ⊥
```

A value of type `a ∈ set X f` can be thought of as a proof that there exists a value `x : X` for which the element function `f` gives `a`.
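As a small sanity check of these definitions, we can show that the empty set is a member of the set containing it: the index is `tt`, and the required equation holds by `refl` (assuming the standard library’s unit type and propositional equality are in scope, as the snippets above already require):

```
∅∈⟨∅⟩ : ∅ ∈ ⟨∅⟩
∅∈⟨∅⟩ = tt , refl
```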

Using these operators, we can define Russell’s paradoxical set as follows:

```
Δ : SET
Δ = set (Σ SET (λ s → s ∉ s)) proj₁
```

This is a set which, for its carrier type, uses *pairs* containing a set `s` and a proof that `s` does not contain itself. The element function just discards the proof, leaving us with the `SET` of all `SET`s that do not contain themselves.

Indeed, we can prove that every set which is in `Δ` does not contain itself:

```
x∈Δ→x∉x : ∀ {X} → X ∈ Δ → X ∉ X
x∈Δ→x∉x ((Y , Y∉Y) , refl) = Y∉Y
```

A corollary of this is that `Δ` itself does not contain itself:

```
Δ∉Δ : Δ ∉ Δ
Δ∉Δ Δ∈Δ = x∈Δ→x∉x Δ∈Δ Δ∈Δ
```

But we know that every set which does not contain itself is in `Δ`:

```
x∉x→x∈Δ : ∀ {X} → X ∉ X → X ∈ Δ
x∉x→x∈Δ {X} X∉X = (X , X∉X) , refl
```

And from this we can derive a contradiction:

```
falso : ⊥
falso = Δ∉Δ (x∉x→x∈Δ Δ∉Δ)
```

I find it very curious that two very different approaches to formalising mathematics end up with much the same stratified character, and for different reasons. Perhaps this Russell-style hierarchy is, kind of like the Church-Turing thesis, a fundamental characteristic of any sufficiently expressive foundation. Something *discovered* rather than *invented*. In the words of Scott [1974]:

> The truth is that there is only one satisfactory way of avoiding the paradoxes: namely, the use of some form of the theory of types. That was at the basis of both Russell’s and Zermelo’s intuitions. Indeed the best way to regard Zermelo’s theory is as a simplification and extension of Russell’s. (We mean Russell’s *simple* theory of types, of course.) The simplification was to make the types *cumulative*. Thus mixing of types is easier and annoying repetitions are avoided. Once the later types are allowed to accumulate the earlier ones, we can then easily imagine *extending* the types into the transfinite — just how far we want to go must necessarily be left open. Now Russell made his types *explicit* in his notation and Zermelo left them *implicit*. [emphasis in original]

The Agda development in this post is taken from one of Thorsten Altenkirch’s lectures, the code of which is available here. The original proof is, as far as I can tell, due to Chad E. Brown, who formulated the same thing in Coq.

Aczel, P., 1988. *Non-well-founded sets*, Center for the Study of Language; Information, Stanford University.

Halmos, P., 1960. *Naïve Set Theory*, Van Nostrand.

Hurkens, A.J.C., 1995. A simplification of Girard’s Paradox. In M. Dezani-Ciancaglini & G. Plotkin, eds. *Typed Lambda Calculi and Applications: Proceedings of the 2nd International Conference on Typed Lambda Calculi and Applications (TLCA-95)*. Berlin, Heidelberg: Springer, pp. 266–278.

Martin-Löf, P., 1984. *Intuitionistic Type Theory*, Bibliopolis.

Russell, B., 1938. *Principles of Mathematics* 2nd ed., W.W. Norton.

Scott, D., 1974. Axiomatizing set theory. In *Axiomatic Set Theory (Proceedings of the Symposium on Pure Mathematics, Vol. XIII, Part II, University of California, Los Angeles, California, 1967)*. American Mathematics Society, Providence, R.I., pp. 207–214.

Zermelo, E., 1930. Über grenzzahlen und mengenbereiche: Neue untersuchungen über die grundlagen der mengenlehre. *Fundamenta Mathematicae*, 16, pp.29–47.

This presentation is not the normal one found in textbooks, which is that every non-empty set contains an element that is disjoint from that set, but that presentation is more brain-bending, and is implied by the statement presented here if you include the axiom of dependent choice.↩

Here, `Set` is the type given to types, similar to the *kind* `*` in Haskell, and is not a reference to the sets of axiomatic set theory.↩

Agda makes use of explicit *universe polymorphism* instead, and I’m still undecided which version of type theory I like better.↩

In set theory, it’s a valid question to ask; just the answer is always “no”.↩

*Falso* is a registered trademark of Estatis, Inc. All Rights Reserved.↩

```
module 2015-08-23-verified-compiler where
open import Data.Fin hiding (_+_) renaming (#_ to i)
open import Data.Nat hiding (_≟_)
open import Data.Vec hiding (_>>=_; _⊛_)
```

Recently my research has been centered around the development of a self-certifying compiler for a functional language with linear types called Cogent (see O’Connor et al. [2016]). The compiler works by emitting, along with generated low-level code, a proof in Isabelle/HOL (see Nipkow et al. [2002]) that the generated code is a refinement of the original program, expressed via a simple functional semantics in HOL.

As dependent types unify for us the language of code and proof, my current endeavour has been to explore how such a compiler would look if it were implemented and verified in a dependently typed programming language instead. In this post, I implement and verify a toy compiler for a language of arithmetic expressions and variables to an idealised assembly language for a virtual stack machine, and explain some of the useful features that dependent types give us for writing verified compilers.

*The Agda snippets in this post are interactive! Click on a symbol to see its definition.*

One of the immediate advantages that dependent types give us is that we can encode the notion of *term wellformedness* in the type given to terms, rather than as a separate proposition that must be assumed by every theorem.

Even in our language of arithmetic expressions and variables, which does not have much of a static semantics, we can still ensure that each variable used in the program is bound somewhere. We will use indices instead of variable names in the style of de Bruijn [1972], and index terms by the *number of available variables*, a trick I first noticed in McBride [2003]. The `Fin` type, used to represent variables, only contains natural numbers up to its index, which makes it impossible to use variables that are not available.

```
data Term (n : ℕ) : Set where
  Lit     : ℕ → Term n
  _⊠_     : Term n → Term n → Term n
  _⊞_     : Term n → Term n → Term n
  Let_In_ : Term n → Term (suc n) → Term n
  Var     : Fin n → Term n
```
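As an illustration, here is a closed term for “let x = 4 in x + 1” (the name `example-term` is arbitrary; `i` is the `#_` helper from `Data.Fin`, renamed in the module header above):

```
example-term : Term 0
example-term = Let Lit 4 In (Var (i 0) ⊞ Lit 1)
```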

This allows us to express in the *type* of our big-step semantics relation that the environment `E` (here we use the length-indexed `Vec` type from the Agda standard library) should have a value for every available variable in the term. In an Isabelle specification of the same, we would have to add such length constraints as explicit assumptions, either in the semantics themselves or in theorems about them. In Agda, the dynamic semantics are extremely clean, unencumbered by irritating details of the encoding:

```
infixl 5 _⊢_⇓_
data _⊢_⇓_ {n : ℕ} (E : Vec ℕ n) : Term n → ℕ → Set where
  lit-e   : ∀{n}
          -------------
          → E ⊢ Lit n ⇓ n
  times-e : ∀{e₁ e₂}{v₁ v₂}
          → E ⊢ e₁ ⇓ v₁
          → E ⊢ e₂ ⇓ v₂
          ---------------------
          → E ⊢ e₁ ⊠ e₂ ⇓ v₁ * v₂
  plus-e  : ∀{e₁ e₂}{v₁ v₂}
          → E ⊢ e₁ ⇓ v₁
          → E ⊢ e₂ ⇓ v₂
          ---------------------
          → E ⊢ e₁ ⊞ e₂ ⇓ v₁ + v₂
  var-e   : ∀{n}{x}
          → E [ x ]= n
          -------------
          → E ⊢ Var x ⇓ n
  let-e   : ∀{e₁}{e₂}{v₁ v₂}
          → E ⊢ e₁ ⇓ v₁
          → (v₁ ∷ E) ⊢ e₂ ⇓ v₂
          ---------------------
          → E ⊢ Let e₁ In e₂ ⇓ v₂
```

By using appropriate type indices, it is possible to extend this technique to work even for languages with elaborate static semantics. For example, linear type systems (see Walker [2005]) can be encoded by indexing terms by type contexts (in a style similar to Oleg). Therefore, the boundary between being *wellformed* and being *well-typed* is entirely arbitrary. It’s possible to use relatively simple terms and encode static semantics as a separate judgement, or to put the entire static semantics inside the term structure, or to use a mixture of both. In this simple example, our static semantics only ensure variables are in scope, so it makes sense to encode the entire static semantics in the terms themselves.

Similar tricks can be employed when encoding our target language, a simple stack machine. This machine consists of two stacks of numbers, the *working* stack `W` and the *storage* stack `S`, and a program to evaluate. A program is a list of *instructions*.

There are six instructions in total, each of which manipulates these two stacks in various ways. When encoding these instructions in Agda, we index the `Inst` type by the size of both stacks before and after execution of the instruction:

```
data Inst : ℕ → ℕ → ℕ → ℕ → Set where
  num   : ∀{w s} → ℕ → Inst w s (suc w) s
  plus  : ∀{w s} → Inst (suc (suc w)) s (suc w) s
  times : ∀{w s} → Inst (suc (suc w)) s (suc w) s
  push  : ∀{w s} → Inst (suc w) s w (suc s)
  pick  : ∀{w s} → Fin s → Inst w s (suc w) s
  pop   : ∀{w s} → Inst w (suc s) w s
```

Then, we can define a simple type for programs, essentially a list of instructions where the stack sizes of consecutive instructions must match. This makes it impossible to construct a program with an underflow error:

```
data SM (w s : ℕ) : ℕ → ℕ → Set where
  halt : SM w s w s
  _∷_  : ∀{w′ s′ w″ s″} → Inst w s w′ s′ → SM w′ s′ w″ s″ → SM w s w″ s″
```
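As an example, a program that computes `1 + 2` starting from empty stacks has the type `SM 0 0 1 0`: it leaves exactly one number on the working stack (the name `one-plus-two` is arbitrary):

```
one-plus-two : SM 0 0 1 0
one-plus-two = num 1 ∷ num 2 ∷ plus ∷ halt
```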

We also define a simple sequential composition operator, equivalent to list append (`++`):

```
infixr 5 _⊕_
_⊕_ : ∀{w s w′ s′ w″ s″} → SM w s w′ s′ → SM w′ s′ w″ s″ → SM w s w″ s″
halt ⊕ q = q
(x ∷ p) ⊕ q = x ∷ (p ⊕ q)
```

The semantics of each instruction are given by the following relation, which takes the two stacks and an instruction as input, returning the two updated stacks as output. Note the size of each stack is prescribed by the type of the instruction, just as the size of the environment was prescribed by the type of the term in the source language, which eliminates the need to add tedious wellformedness assumptions to theorems or rules.

```
infixl 5 _∣_∣_↦_∣_
data _∣_∣_↦_∣_ : ∀{w s w′ s′}
               → Vec ℕ w → Vec ℕ s
               → Inst w s w′ s′
               → Vec ℕ w′ → Vec ℕ s′
               → Set where
```

The semantics of each instruction are as follows:

- `num n` (where `n : ℕ`), pushes `n` to `W`.

```
  num-e : ∀{w s}{n}{W : Vec _ w}{S : Vec _ s}
        -------------------------
        → W ∣ S ∣ num n ↦ n ∷ W ∣ S
```

- `plus`, pops two numbers from `W` and pushes their sum back to `W`.

```
  plus-e : ∀{w s}{n m}{W : Vec _ w}{S : Vec _ s}
         ----------------------------------------
         → (n ∷ m ∷ W) ∣ S ∣ plus ↦ (m + n ∷ W) ∣ S
```

- `times`, pops two numbers from `W` and pushes their product back to `W`.

```
  times-e : ∀{w s}{n m}{W : Vec _ w}{S : Vec _ s}
          -----------------------------------------
          → (n ∷ m ∷ W) ∣ S ∣ times ↦ (m * n ∷ W) ∣ S
```

- `push`, pops a number from `W` and pushes it to `S`.

```
  push-e : ∀{w s}{n}{W : Vec _ w}{S : Vec _ s}
         --------------------------------
         → (n ∷ W) ∣ S ∣ push ↦ W ∣ (n ∷ S)
```

- `pick x` (where `x : Fin s`), pushes the number at position `x` from the top of `S` onto `W`.

```
  pick-e : ∀{w s}{x}{n}{W : Vec _ w}{S : Vec _ s}
         → S [ x ]= n
         ----------------------------
         → W ∣ S ∣ pick x ↦ (n ∷ W) ∣ S
```

- `pop`, removes the top number from `S`.

```
  pop-e : ∀{w s}{n}{W : Vec _ w}{S : Vec _ s}
        -------------------------
        → W ∣ (n ∷ S) ∣ pop ↦ W ∣ S
```

As programs are lists of instructions, the evaluation of programs is naturally specified as a list of evaluations of instructions:

```
infixl 5 _∣_∣_⇓_∣_
data _∣_∣_⇓_∣_ {w s : ℕ} (W : Vec ℕ w) (S : Vec ℕ s) : ∀{w′ s′}
               → SM w s w′ s′
               → Vec ℕ w′ → Vec ℕ s′
               → Set where
  halt-e : W ∣ S ∣ halt ⇓ W ∣ S
  _∷_    : ∀{w′ s′ w″ s″}{i}{is}
         → {W′ : Vec ℕ w′}{S′ : Vec ℕ s′}
         → {W″ : Vec ℕ w″}{S″ : Vec ℕ s″}
         → W ∣ S ∣ i ↦ W′ ∣ S′
         → W′ ∣ S′ ∣ is ⇓ W″ ∣ S″
         --------------------------
         → W ∣ S ∣ (i ∷ is) ⇓ W″ ∣ S″
```

The semantics of sequential composition is predictably given by appending these lists:

```
infixl 4 _⟦⊕⟧_
_⟦⊕⟧_ : ∀{w w′ w″ s s′ s″}{P}{Q}
      → {W : Vec ℕ w}{S : Vec ℕ s}
      → {W′ : Vec ℕ w′}{S′ : Vec ℕ s′}
      → {W″ : Vec ℕ w″}{S″ : Vec ℕ s″}
      → W ∣ S ∣ P ⇓ W′ ∣ S′
      → W′ ∣ S′ ∣ Q ⇓ W″ ∣ S″
      -------------------------
      → W ∣ S ∣ P ⊕ Q ⇓ W″ ∣ S″
halt-e ⟦⊕⟧ ys = ys
x ∷ xs ⟦⊕⟧ ys = x ∷ (xs ⟦⊕⟧ ys)
```

Having formally defined our source and target languages, we can now prove our compiler correct — even though we haven’t written a compiler yet!

One of the other significant advantages dependent types bring to compiler verification is the elimination of repetition. In my larger Isabelle formalisation, the proof of the compiler’s correctness largely duplicates the structure of the compiler itself, and this tight coupling means that proofs must be rewritten along with the program — a highly tedious exercise. As dependently typed languages unify the language of code and proof, we can merely provide the correctness proof: in almost all cases, the correctness proof is so specific that the program it proves correct can be *derived automatically*.

```
open import Data.Product
Exists = ∃
syntax Exists (λ x → y ) = ∃[ x ] y
open import Function
open import Data.String
```

We define a compiler’s correctness as the commutativity of the usual diagram relating source evaluation, compilation, and target evaluation, as per Hutton and Wright [2004].

As we have not proven determinism for our semantics^{1}, such a correctness condition must be shown by the conjunction of a *soundness* and *completeness* condition, similar to Bahr [2015].

**Soundness** is a proof that the compiler output is a *refinement* of the input, that is, every evaluation in the output is matched by the input. The output does not do anything that the input doesn’t do.

```
-- Sound t u means that u is a sound translation of t
Sound : ∀{w s} → Term s → SM w s (suc w) s → Set
Sound {w} t u = ∀{v}{E}{W : Vec ℕ w}
  → W ∣ E ∣ u ⇓ (v ∷ W) ∣ E
  -----------------------
  → E ⊢ t ⇓ v
```

Note that we generalise the evaluation statements used here slightly to use arbitrary environments and stacks. This is to allow our induction to proceed smoothly.

**Completeness** is a proof that the compiler output is an *abstraction* of the input, that is, every evaluation in the input is matched by the output. The output does everything that the input does.

```
Complete : ∀{w s} → Term s → SM w s (suc w) s → Set
Complete {w} t u = ∀{v}{E}{W : Vec ℕ w}
  → E ⊢ t ⇓ v
  -----------------------
  → W ∣ E ∣ u ⇓ (v ∷ W) ∣ E
```

It is this *completeness* condition that will allow us to automatically derive our code generator. Given a term `t`, our generator will return a Σ-type, or *dependent pair*, containing a program `u` and a proof that `u` is a complete translation of `t`:

```
codegen′ : ∀{w s}
         → (t : Term s)
         → Σ[ u ∈ SM w s (suc w) s ] Complete t u
```

For literals, we simply push the number of the literal onto the working stack:

```
codegen′ (Lit x) = _ , proof
  where
    proof : Complete _ _
    proof lit-e = num-e ∷ halt-e
```

The code above never explicitly states what program to produce! Instead, it merely provides the completeness proof, and the rest can be inferred by unification. Similar elision can be used for variables, which pick the correct index from the storage stack:

```
codegen′ (Var x) = _ , proof
  where
    proof : Complete _ _
    proof (var-e x) = pick-e x ∷ halt-e
```

The two binary operations are essentially the standard translation for an infix-to-postfix tree traversal; once again the program is not explicitly emitted, but inferred from the completeness proof used.

```
codegen′ (t₁ ⊞ t₂) = _ , proof (proj₂ (codegen′ t₁)) (proj₂ (codegen′ t₂))
  where
    proof : ∀ {u₁}{u₂} → Complete t₁ u₁ → Complete t₂ u₂ → Complete _ _
    proof p₁ p₂ (plus-e t₁ t₂) = p₁ t₁ ⟦⊕⟧ p₂ t₂ ⟦⊕⟧ plus-e ∷ halt-e
codegen′ (t₁ ⊠ t₂) = _ , proof (proj₂ (codegen′ t₁)) (proj₂ (codegen′ t₂))
  where
    proof : ∀ {u₁}{u₂} → Complete t₁ u₁ → Complete t₂ u₂ → Complete _ _
    proof p₁ p₂ (times-e t₁ t₂) = p₁ t₁ ⟦⊕⟧ p₂ t₂ ⟦⊕⟧ times-e ∷ halt-e
```

The variable-binding form pushes the bound value to the storage stack with `push`, and cleans up with `pop` after evaluation exits the scope.

```
codegen′ (Let t₁ In t₂)
  = _ , proof (proj₂ (codegen′ t₁)) (proj₂ (codegen′ t₂))
  where
    proof : ∀ {u₁}{u₂} → Complete t₁ u₁ → Complete t₂ u₂ → Complete _ _
    proof p₁ p₂ (let-e t₁ t₂)
      = p₁ t₁ ⟦⊕⟧ push-e ∷ (p₂ t₂ ⟦⊕⟧ pop-e ∷ halt-e)
```

We can extract a more standard-looking code generator function simply by throwing away the proof that our code generator produces.

```
codegen : ∀{w s}
        → Term s
        → SM w s (suc w) s
codegen {w}{s} t = proj₁ (codegen′ {w}{s} t)
```
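As a quick illustration, applying `codegen` to a small closed term yields a stack machine program directly (the name `small` is arbitrary):

```
small : SM 0 0 1 0
small = codegen (Lit 2 ⊞ Lit 3)
```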

We use an alternative presentation of the soundness property, that makes explicit several equalities that are implicit in the original formulation of soundness. We prove that our new formulation still implies the original one.

```
open import Relation.Binary.PropositionalEquality
Sound′ : ∀{w s} → Term s → SM w s (suc w) s → Set
Sound′ {w} t u = ∀{E E′}{W : Vec ℕ w}{W′}
  → W ∣ E ∣ u ⇓ W′ ∣ E′
  ------------------------------------------
  → (E ≡ E′) × (tail W′ ≡ W) × E ⊢ t ⇓ head W′

sound′→sound : ∀{w s}{t}{u} → Sound′ {w}{s} t u → Sound t u
sound′→sound p x with p x
... | refl , refl , q = q
```

As our soundness proof requires us to do a lot of rule inversion on the evaluation of programs, we need an eliminator for the introduction rule `_⟦⊕⟧_`, used in the completeness proof, which breaks an evaluation of a sequential composition into evaluations of its component parts:

```
⊕-elim : ∀{w s w′ s′ w″ s″}
         {W : Vec ℕ w}{S : Vec ℕ s}
         {W″ : Vec ℕ w″}{S″ : Vec ℕ s″}
         {a : SM w s w′ s′}{b : SM w′ s′ w″ s″}
       → W ∣ S ∣ a ⊕ b ⇓ W″ ∣ S″
       → ∃[ W′ ] ∃[ S′ ] ((W ∣ S ∣ a ⇓ W′ ∣ S′) × (W′ ∣ S′ ∣ b ⇓ W″ ∣ S″))
⊕-elim {a = halt}   p       = _ , _ , halt-e , p
⊕-elim {a = a ∷ as} (x ∷ p) with ⊕-elim {a = as} p
... | _ , _ , p₁ , p₂ = _ , _ , x ∷ p₁ , p₂
```

Then the soundness proof is given as a boatload of rule inversion and matching on equalities, to convince Agda that there is no other way to possibly evaluate the compiler output:

```
soundness : ∀{w s}{t : Term s} → Sound′ {w} t (codegen t)
soundness {t = Lit x} (num-e ∷ halt-e)     = refl , refl , lit-e
soundness {t = Var x} (pick-e x₁ ∷ halt-e) = refl , refl , var-e x₁
soundness {t = t₁ ⊠ t₂} x
  with ⊕-elim {a = codegen t₁ ⊕ codegen t₂} x
... | _ , _ , p , _
  with ⊕-elim {a = codegen t₁} p
... | _ , _ , p₁ , p₂
  with soundness {t = t₁} p₁ | soundness {t = t₂} p₂
soundness {t = t₁ ⊠ t₂} x
  | _ ∷ _ ∷ _ , ._ , _ , times-e ∷ halt-e
  | ._ ∷ ._ , ._ , _ , _
  | refl , refl , a
  | refl , refl , b
  = refl , refl , times-e a b
soundness {t = t₁ ⊞ t₂} x
  with ⊕-elim {a = codegen t₁ ⊕ codegen t₂} x
... | _ , _ , p , _
  with ⊕-elim {a = codegen t₁} p
... | _ , _ , p₁ , p₂
  with soundness {t = t₁} p₁ | soundness {t = t₂} p₂
soundness {t = t₁ ⊞ t₂} x
  | _ ∷ _ ∷ _ , ._ , _ , plus-e ∷ halt-e
  | ._ ∷ ._ , ._ , _ , _
  | refl , refl , a
  | refl , refl , b
  = refl , refl , plus-e a b
soundness {t = Let t₁ In t₂} x
  with ⊕-elim {a = codegen t₁} x
... | _ ∷ _ , _ , p₁ , push-e ∷ q
  with ⊕-elim {a = codegen t₂} q
... | _ ∷ _ , _ ∷ _ , p₂ , _
  with soundness {t = t₁} p₁ | soundness {t = t₂} p₂
soundness {t = Let t₁ In t₂} x
  | _ ∷ ._ , ._ , _ , push-e ∷ q
  | _ ∷ ._ , ._ ∷ ._ , _ , pop-e ∷ halt-e
  | refl , refl , a
  | refl , refl , b
  = refl , refl , let-e a b
```

Now that we have a verified code generator, as a final flourish we’ll implement a basic compiler frontend^{2} for our language and run it on some basic examples.

We define a surface syntax as follows. In the tradition of all the greatest languages, such as BASIC, FORTRAN and COBOL, capital letters are exclusively used, and English words are favoured over symbols, because it makes the language readable to non-programmers. I should also acknowledge the definite influence of PHP, Perl and `sh` on the choice of the `$` sigil to precede variable names. The sigil `#` precedes numeric literals, as Agda does not allow us to overload them.

```
data Surf : Set where
  LET_BE_IN_ : String → Surf → Surf → Surf
  _PLUS_     : Surf → Surf → Surf
  _TIMES_    : Surf → Surf → Surf
  $_         : String → Surf
  #_         : ℕ → Surf

infixr 4 LET_BE_IN_
infixl 5 _PLUS_
infixl 6 _TIMES_
infix  7 $_
infix  7 #_
```

Unlike our `Term` AST, this surface syntax does not include any scope information, uses strings for variable names, and is more likely to be something that would be produced by a parser. In order to compile this language, we must first translate it into our wellformed-by-construction `Term` type, which necessitates *scope-checking*.

```
open import Data.Maybe
open import Category.Monad
open import Category.Applicative
import Level
open RawMonad (monad {Level.zero})
open import Relation.Nullary
```

```
check : ∀{n} → Vec String n → Surf → Maybe (Term n)
check Γ (LET x BE s IN t) = pure Let_In_ ⊛ check Γ s ⊛ check (x ∷ Γ) t
check Γ (s PLUS t)        = pure _⊞_ ⊛ check Γ s ⊛ check Γ t
check Γ (s TIMES t)       = pure _⊠_ ⊛ check Γ s ⊛ check Γ t
check Γ (# x)             = pure (Lit x)
check Γ ($ x)             = pure Var ⊛ find Γ x
  where
    find : ∀{n} → Vec String n → String → Maybe (Fin n)
    find [] s = nothing
    find (x ∷ v) s with s ≟ x
    ... | yes _ = just zero
    ... | no  _ = suc <$> find v s
```

Note that this function is the only one in our development that is partial: it can fail if an undeclared variable is used. For this reason, we use the `Applicative` instance for `Maybe` to make the error handling more convenient.

Our compiler function, then, merely composes our checker with our code generator:

```
compiler : Surf → Maybe (SM 0 0 1 0)
compiler s = codegen <$> check [] s
```

Note that we can’t really demonstrate correctness of the scope-checking function, save that if it outputs a `Term` then there are no scope errors in the input, as it is impossible to construct a `Term` with scope errors. One possibility would be to define a semantics for the surface syntax; however, this would necessitate a formalisation of substitution and other such unpleasant things. So, we shall gain assurance for this phase of the compiler by embedding some test cases and checking them automatically at compile time.

If we take a simple example, say:

```
example = LET "x" BE # 4
       IN LET "y" BE # 5
       IN LET "z" BE # 6
       IN $ "x" TIMES $ "y" PLUS $ "z"
```

We expect that this program should compile to the following stack machine program:^{3}

```
result : SM 0 0 1 0
result = num 4
       ∷ push
       ∷ num 5
       ∷ push
       ∷ num 6
       ∷ push
       ∷ pick (i 2)
       ∷ pick (i 1)
       ∷ times
       ∷ pick (i 0)
       ∷ plus
       ∷ pop
       ∷ pop
       ∷ pop
       ∷ halt
```

We can embed this test case as a type by constructing an equality value — that way, the test will be re-run every time it is type-checked:

```
test-example : compiler example ≡ just result
test-example = refl
```

As this page is only generated when the Agda compiler type checks the code snippets, we know that this test has passed! Hooray!

Working in Agda to verify compilers is a very different experience from implementing a certifying compiler in Haskell and Isabelle. In general, the *implementation* of a compiler phase and the *justification of its correctness* are much, much closer together in Agda than in my previous approach. This allows us to save a lot of effort by deriving programs from their proofs.

Also, dependent types are sophisticated enough to allow arbitrary invariants to be encoded in the structure of terms, which makes it possible, with clever formalisations, to avoid having to discharge trivial proof obligations repeatedly. This is in stark contrast to traditional theorem provers like Isabelle, where irritating proof obligations are the norm, and heavyweight tactics must be used to discharge them en masse.

My next experiments will be to try and scale this kind of approach up to more realistic languages. I’ll be sure to post again if I find anything interesting.

Bahr, P., 2015. Calculating certified compilers for non-deterministic languages. In R. Hinze & J. Voigtländer, eds. *Mathematics of Program Construction*. Lecture Notes in Computer Science. Springer International Publishing, pp. 159–186.

de Bruijn, N.G., 1972. Lambda Calculus Notation with Nameless Dummies: a Tool for Automatic Formula Manipulation with Application to the Church-Rosser Theorem. *Indagationes Mathematicae (Elsevier)*, 34, pp.381–392.

Hutton, G. and Wright, J., 2004. Compiling exceptions correctly. In *Mathematics of Program Construction*. Springer, pp. 211–227.

McBride, C., 2003. First-Order Unification by Structural Recursion. *Journal of Functional Programming*, 13(6), pp.1061–1075.

Nipkow, T., Paulson, L.C. and Wenzel, M., 2002. *Isabelle/HOL: A Proof Assistant for Higher-Order Logic*, Springer.

O’Connor, L., Keller, G., Murray, T., Klein, G., Rizkallah, C., Chen, Z., Sewell, T., Lim, J., Amani, S., Nagashima, Y. and Hixon, A., 2016. Cogent: Certified compilation for a Functional Systems Language. *Submitted to POPL. Currently under review*.

Walker, D., 2005. Substructural Type Systems. In B. C. Pierce, ed. *Advanced Topics in Types and Programming Languages*. MIT Press.

I have written before about the use of total, purely functional languages to eliminate cumbersome low-level reasoning. By having a denotational semantics that is about as straightforward as high-school algebra, we can make many verification problems substantially simpler. Totality and the absence of effects^{1} mean that we can pick redexes in any order, and have a well-founded induction principle for datatypes [Turner 2004].

In existing large-scale verified software artifacts like seL4, however, we still use *C* as our implementation language^{2}, despite the fact that it is positively hellish to reason about [Koenig 1988]. The reasons for this are numerous, but two stand out. The first concern, and the least important, is that most purely functional language implementations, and certainly all total ones, depend on a runtime, which would enlarge the trusted computing base and compromise the system’s efficiency. The second and more pressing concern is that low-level systems like microkernels, and even services such as drivers and file system servers, are forced to confront the reality of the von Neumann architecture. Sometimes they need to manipulate bytes in memory, or perform potentially unsafe pointer arithmetic. If we follow traditional systems orthodoxy, such operations simply cannot be efficiently expressed at the level of abstraction of, say, Haskell.

This has meant that these systems are forced to choose an implementation language which requires no runtime support and which supports all of these unsafe features. Sadly, traditional “systems” languages such as C, while satisfying these criteria, will always extract their pound of flesh when it comes to verification. The huge cost to verifiability that comes with allowing divergence, unsafe memory access and so on is not just paid where those semantic features are used, but *everywhere in the program*. The majority of the seL4 proofs are concerned with re-establishing refinement invariants between a state-monadic executable specification and the C implementation. This specification is, more or less, semantically equivalent to the C implementation, but the proof is huge. The majority of obligations are about things like pointer validity, heap validity, type safety and loop termination — things that we don’t have to worry about in total, pure languages.

My research project is focused on reducing the cost of verification by replacing the systems implementation language with one that has a straightforward denotational semantics, about which correctness properties can be easily stated. This language is under a number of constraints: It can’t rely on a runtime, and it must have minimal performance overhead compared to a manually written C implementation. Furthermore, the compilation needs to be highly trustworthy.

The language Cogent, which we will submit to POPL this year, is essentially a linear lambda calculus, with a bunch of usability features, such as pattern matching, records and a simple C FFI. The use of linear types eliminates the need for a garbage collector, and allows for efficient implementation of a purely functional semantics using destructive updates [Wadler 1990]. Indeed, two semantics are ascribed to the language: one which resembles any pure functional language (a denotation in terms of sets and total functions), and one of a much more imperative flavour (with a state-monadic denotation). We have proven that the imperative semantics is a refinement of the pure semantics for any well-typed program, and the compiler *co-generates* an efficient C implementation and a proof, in Isabelle/HOL [Nipkow et al. 2002], that this imperative semantics is an abstraction of the C code it generates.

To verify properties about programs written in this language, it suffices to use the simple equational reasoning tactics used for any HOL program. Hoare logic and stateful reasoning are gone, and high-level properties are generally stated as equations between terms. The snag, however, is in the C FFI. As the language is so restrictive (total, safe, and with no indirectly observable mutation), the C FFI is used heavily to provide everything from red-black trees to loop iterators. While Cogent lets us reason nicely about code written in it, the moment a C function is used it produces a number of thorny proof obligations, essentially requiring us to show that, at least to an outside observer, the C code has not violated the invariants assumed by Cogent’s type system.

We were able to express the vast majority of an Ext2 file system implementation in Cogent, and the verification of file systems written in Cogent is certainly easier than a raw C implementation. However, there are a number of places in the file system implementation where efficiency is sacrificed in order to be able to express the file system in a safe way. For example, deserialising structures from a disk buffer into file system structures is done byte-by-byte, rather than by a memory map.

The flaw in Cogent’s approach is that it’s all-or-nothing. If a program can’t be expressed in this highly restrictive language, it must be expressed in C, and then all of Cogent’s verification advantages are lost.

To remedy this, I first designed a *spectrum* of languages, each differing primarily in their treatment of memory.

The highest-level language in the spectrum is more or less Cogent as it exists now: linear types allow for the extremely simple semantics that makes verification much easier. The mid-level language is less restrictive, doing away with linear types and bringing explicit references into the language. This introduces manual memory management and stateful reasoning, although the memory model remains fairly high-level. It is possible to leak memory or invalidate pointers in this language, unlike in the linear one, but the lack of linear types now permits programs that rely on sharing mutable pointers, such as efficient data structure implementations. The lowest-level language is also stateful, but its state is a heap of bytes. Here, pointer arithmetic is permitted, as well as arbitrary casting and reinterpretation of memory, putting it more or less on the same semantic level as C.
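The three treatments of memory can be caricatured in Haskell, one function per level (purely illustrative; none of this is the proposed syntax):

```haskell
import Data.IORef (IORef, modifyIORef')
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Storable (peekByteOff, pokeByteOff)
import Data.Word (Word8, Word32)

-- Level 1 (linear, Cogent-like): a pure function; no observable sharing.
bumpPure :: Word32 -> Word32
bumpPure n = n + 1

-- Level 2 (explicit references): mutation through a reference; two names
-- may now alias the same cell, and cells must be managed by hand.
bumpRef :: IORef Word32 -> IO ()
bumpRef r = modifyIORef' r (+ 1)

-- Level 3 (heap of bytes): raw memory, read and written byte by byte.
bumpRaw :: IO Word8
bumpRaw = allocaBytes 1 $ \p -> do
  pokeByteOff p 0 (41 :: Word8)
  b <- peekByteOff p 0 :: IO Word8
  pokeByteOff p 0 (b + 1)
  peekByteOff p 0
```

Each step down buys expressiveness (aliasing, then raw layout control) at the cost of a more complicated program logic for verification.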

Clearly, the compiler can simply translate the highest-level language into the mid-level one, and the mid-level one into the lowest. Once at the lowest level, it is straightforward to emit implementation code in LLVM IR or C. The advantage of this spectrum is that we can allow the programmer access to *every* level of abstraction. I plan to achieve this by allowing code from the lowest-level language to be written *inline* inside the mid-level one, and both of these in turn inside the linear language, in the vein of the inline assembly supported by most C compilers. The crucial point here is that each of these inline blocks will generate a *proof obligation* to ensure that, externally, the inline block is “well-behaved” with respect to the abstractions of the higher-level language. For example, embedding the byte-level language inside the reference language generates the obligation that any valid pointers left by the program point to data of the right type. Embedding the reference language inside the linear language requires showing that all available pointers are valid, no memory was leaked, and that at most one pointer to each object remains. I am exploring the possibility of making each of these languages embedded inside a dependently-typed programming language such as Agda or Idris, to allow the proofs to be expressed along with the program.
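The obligations at the linear boundary can be modelled concretely with a toy heap (all names here are illustrative and hypothetical, not part of any real framework):

```haskell
import qualified Data.Map as M
import Data.List (nub)

-- Toy model: pointers are Ints; the heap maps allocated pointers to objects.
type Ptr' = Int
type Heap = M.Map Ptr' String

-- The obligation for embedding the reference language in the linear one:
-- after the inline block runs, every remaining pointer must be valid,
-- nothing may be leaked, and each object has at most one pointer to it.
wellBehaved :: Heap -> [Ptr'] -> Bool
wellBehaved heap live =
     all (`M.member` heap) live          -- all remaining pointers are valid
  && live == nub live                    -- at most one pointer per object
  && all (`elem` live) (M.keys heap)     -- no allocation was leaked
```

In the real design this predicate would be a theorem discharged once per inline block, not a runtime check; the point is only that the obligation is a statement about the block's externally visible heap effects.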

These different languages are all total [Turner 2004]. Trying to fit recursion or loops into our spectrum leads to some unfortunate consequences. Putting them at the “Liberty” end would mean throwing away our nice memory model whenever we want to write a loop, and putting them at the “Safety” end, to the left of the linear language, would mean that every time we step down to a lower level of memory abstraction we would also be obliged to prove termination. So, rather than a *spectrum*, we have a *lattice* of languages, with non-total languages running parallel to the total ones.

The compiler moves from the upper-left corner of this lattice towards the lower-right, and embedding a language from the lower-right corner inside a language towards the upper-left requires a proof.
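The termination side of this trade-off can be made concrete with an explicit fuel bound, a standard trick for recovering general recursion in a total language (a hypothetical sketch, not the proposed syntax):

```haskell
-- gcd with an explicit fuel parameter: the function is total because
-- every recursive call decreases the fuel; Nothing signals that the
-- bound was exhausted before an answer was reached.
gcdFuel :: Int -> Int -> Int -> Maybe Int
gcdFuel 0    _ _ = Nothing
gcdFuel _    a 0 = Just a
gcdFuel fuel a b = gcdFuel (fuel - 1) b (a `mod` b)
```

Crossing from a total language to a non-total one in the lattice discharges this bookkeeping in exchange for a termination proof obligation at the boundary.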

I have yet to completely clarify exactly what these languages will look like. It’s my hope that they can share a large amount of syntax, to avoid confusing programmers. Once I work out the details, I suspect this approach will allow programmers to implement systems in a reasonably high-level way while breaking some language-based abstractions when necessary, where the only costs to verifiability come directly from the points where an abstraction is broken.

Koenig, A., 1988. *C Traps and Pitfalls*, Pearson Education India.

Nipkow, T., Paulson, L.C. & Wenzel, M., 2002. *Isabelle/HOL: A Proof Assistant for Higher-Order Logic*, Springer.

Turner, D., 2004. Total functional programming. *Journal of Universal Computer Science*, 10, pp.187–209.

Wadler, P., 1990. Linear types can change the world! In M. Broy & C. Jones, eds. *IFIP TC 2 Working Conference on Programming Concepts and Methods, Sea of Galilee, Israel*. North Holland Publishing, pp. 347–359.

On the 23rd of April 2014, the first ever live theorem-proving competition was held at FP-SYD, with predictably great results.

The six contestants, fueled by nothing but alcoholic beverages and armed only with their favourite Coq environment and a handful of basic tactics, set out to prove a set of challenging lemmas in elimination rounds of five minutes or less.

Commentary (and sledging) was helpfully provided by Boey Maun Suang, Thomas Sewell and Erik de Castro Lopo, as well as our contestants on the sidelines.

The final rankings were:

- Amos Robinson
- Liam O’Connor (me!)
- Tony Sloane
- Dom De Re and Eric Willigers (tied)
- Ben Lippmeier

Amos took home the chicken trophy, and a good time was had by all. It’s a remarkably fun event, both for those watching (many of whom were not familiar with Coq or theorem proving), and for those participating (I found it quite exhilarating!).

Many thanks go to Ben Lippmeier who organised basically everything and calibrated each lemma for difficulty, and to Atlassian for hosting FP-SYD.


For the first semester of 2014, I will be teaching tutorials for the following two courses:

COMP1927 Computing 2, with Gabi Keller as LIC, an introductory course for data structures and algorithms.

SENG2011 Software Engineering Workshop 2A, a course that has recently been rejuvenated by Carroll Morgan to cover formal and informal methods of writing programs to meet specifications.

I will also be assisting in the administration of COMP3141 Software System Design and Implementation, taught by Manuel Chakravarty, a course focusing on the use of equational properties to drive software development, ranging from QuickCheck-style testing to formal verification in Agda.

COMP4141 Theory of Computation is running again, and I once again encourage anyone interested to enrol. It doesn’t run every year, and it’s fascinating and educational. Kai Engelhardt sadly won’t be teaching it, as he is busy teaching COMP2111 System Modelling and Design, which will be reinvented along similar lines to SENG2011, although I don’t know if that’s happening this time around.

COMP6752 Modelling Concurrent Systems, formerly known as Comparative Concurrency Semantics, is being run by Rob van Glabbeek. I will be attending this course. It covers all sorts of models and semantics for concurrent systems, notions of semantic equivalence, modal and temporal logics, and so on. For those who enjoyed Foundations of Concurrency last year, this is a natural continuation, and it’s taught by a luminary in the field, to boot.

For those wishing to choose electives, in addition to the above I recommend COMP3821 Extended Algorithms, COMP3153 Algorithmic Verification, and COMP3891 Extended Operating Systems.
