The main direction of my research thus far has been to exploit linear types to make purely functional, equational reasoning possible for systems programming. I've talked about the difficulty of verifying imperative programs before, but my past entries about linear types did not discuss the advantages that they can bring to such a domain.

Chiefly, they allow us to reason about our program as if all data
structures are immutable, with all of the benefits that implies, while
the actual implementation performs efficient destructive updates to
mutable data structures. This is achieved simply by statically ruling
out every program where the difference between the immutable and the
mutable interpretations can be observed, by requiring that every
variable that refers, directly or indirectly, to a heap data structure,
must be used exactly once. As variables cannot be used multiple times,
this implies that for any allocated heap object, there is exactly one
live, usable reference that exists at any one time. This is called the
*uniqueness* property of linear types.

This is a very simple restriction, but it proves a considerable burden when trying to actually write programs. For example, a naïve definition of a linear array would become unusable after just one element was accessed! Other data structures, with complex reference layouts that involve multiple aliasing references and sharing, simply cannot be expressed.

For this reason, when designing the linear systems language Cogent
for my research, I allowed parts of the program to be written in unsafe,
imperative C, and those C snippets are able to manipulate opaque types
that are *abstract* in the purely functional portion. The author
of the code would then have to prove that the C code doesn't do anything
too unsafe, that would violate the invariants of the linear type
system.

Specifically, Cogent extends the (dynamic) typing relation for values
to include *sets of locations* which can be accessed from a
value^{1}. For example, the typing rule of for
tuple values is:

Observe how we have used these pointer sets to enforce that there is
no internal aliasing in the structure. It also gives us the information
necessary to precisely state the conditions under which a C program is
safe to execute. We define *stores*, denoted
, as a partial mapping from a location or pointer
to a value
.

Assuming a C-implemented function is evaluated with an input value and an input store , the return value and output store must satisfy the following three properties for all locations :

**Leak freedom**- , that is any input reference that wasn't returned was freed.**Fresh allocation**- , that is every new output reference, not in the input, was allocated in prevously-free space.**Inertia**- , that is, every reference not in either the input or the output of the function has not been touched in any way.

Assuming these three things, it's possible to show that the two
semantic interpretations of linear typed programs are equivalent, even
if they depend on unsafe, imperative C code. I called these three
conditions together the *frame conditions*, named after the
*frame problem*, from the field of knowledge representation. The
frame problem is a common issue that comes up in many formalisations of
stateful processes. Specifically, it refers to the difficulty of
*local reasoning* for many of these formalisations. The state or
store is typically represented (as in our Cogent formalisation above) as
a large, monolithic blob. Therefore, whenever any part of the state is
updated, every invariant about the state must be re-established, even if
it has nothing to do with the part of the state that was updated. The
above conditions allow us to state that the C program does not affect
any part of the state except those it is permitted (by virtue of the
linear references it recieved) to modify, thus allowing us to enforce
the type system invariants across the whole program.

Presenting such proof obligations in terms of stores and references as described above, however, is extremely tedious and difficult to work with when formally reasoning about imperative programs, particularly if the invariants we are trying to show are initially broken and only later re-established. Typically, imperative programs lend themselves to axiomatic semantics for verification, the most obvious example being Hoare Logic [@Hoare], which provides a proof calculus for a judgement written , which states that, assuming the initial state (which maps variables to values) satisfies an assertion , then the resultant state of running on , satisfies .

When our assertions involve references and aliasing, however, Hoare
Logic doesn't buy us much over just reasoning about the operational
semantics directly. A variety of ad-hoc operators have to be added to
the logic, for example to say that references do not alias, references
point to free space, or that references point to valid values. To make
this cleaner, we turn instead to the *Separation Logic*
[@Reynolds]. Separation logic is a variant of Hoare Logic that is
specifically designed to accommodate programming with references and
aliasing. It augments the state of Hoare Logic with a mutable store
, and the following additional assertions:

- A special assertion , which states that the store is empty, i.e if and only if .
- A binary operator
, which states that the store is defined at
*exactly one*location, i.e if and only if . - A
*separating conjunction*connective , which says that the store can be split into two disjoint parts and where and . - A
*separating implication*connective , which says that extending the store with a disjoint part that satisfies results in a store that satisfies .

Crucially, Separation Logic includes the *frame rule*, its own
solution to the frame problem, where an unrelated assertion
can be added to both the pre- and the post-condition of a given program
in a separating conjunction:

This allows much the same local reasoning that we desired before: The
program
can be verified to work for a store that satisfies
, but otherwise contains *no other values*. Then that program may
be freely used with a *larger* state and we automatically learn,
from the frame rule, that any unrelated bit of state cannot affect, and
is not affected by the program
.

Separation logic makes expressing these obligations substantially simpler. For example, given a program with an input pointers and and output pointers , we can express all three frame conditions as a single triple:

Here is a sketch of a proof that this implies the frame conditions listed above. Assume an input store . Split into disjoint stores and such that . Let the output store of running with be . Note that by the triple above, we have that .

We have by the frame rule that the output of running with the full store is where .

**Leak freedom**- For any arbitrary location , if but then we must show that . As , we know from that and, as they are disjoint, . Therefore, the only way for to be true is if , but as from , we can conclude that .**Fresh allocation**- If but then we must show that . We have from that , and hence . As they are disjoint, so the only way for to be true is if . But, as we know that from and , we can conclude that .**Inertia**- If and , then we can conclude from that and from that . If , then , thanks to the frame rule as shown above. If , then and and therefore we can say that as they're both undefined.

I think this is a much cleaner and easier way to state the frame conditions.

My next item to investigate is how I might integrate this into a seamless language and verification framework. My current thinking is to take a lambda calculus with linear types and refinement types, and augment it with an imperative embedded language, which allows several of the guarantees of the linear type system to be suspended. The imperative embedded language might resemble the Hoare-state monad (Swierstra 2009), only using Separation Logic rather than Hoare Logic, but I am still figuring out all the details.

### References

*Theorem proving in higher order logics*, Vol. 5674, Springer Berlin Heidelberg, pp. 440–451.

The real formalisation is a bit more complicated, allowing nonlinear

*read-only*pointers as well as linear, writable ones.↩︎