30 September 2022

What to Do to Make Sure Your Code Works

As part of my Ph.D., I have written an article with two co-authors that has just been published as part of the proceedings of Logic in Computer Science (LICS) 2022. LICS is the flagship conference in type theory and the general use of mathematical logic to describe programming languages. The title is "Greatest HITs: Higher Inductive Types in Coinductive Definitions via Induction Under Clocks" and in this article, I give a little introduction to what it is about.

AUHTOR

Magnus Baunsgaard Kristensen

Senior Consultant

The core question of the field I worked with, dependent type theory, is as follows: What do you do to make sure your code works? Not just in specific scenarios, but mathematically sure that for all possible inputs to the program an output is produced that follows the requirements specification. A possible answer in this situation is to use dependent type theory.

Type theory is a logical language that can be used both as a foundation for mathematics and a programming language. Specifically, a type theory corresponds to a functional language with strong types that come with a set of automatic guarantees about programs written in the language. In addition to these built-in guarantees, dependent type theory allows you to write program specifications in type theory itself, using it as a logical language. One can then mathematically prove that a program implements a specification with both program, proof and specification written in the same language.

Dependent type theory has two limitations: one as a logic and the other as a programming language. The logical problem can be solved with Cubical Type Theory (CTT), and the problem on the programming side can be solved with Clocked Type Theory (CloTT). In the article, they are combined into Clocked Cubical Type Theory (CCTT), which with Higher Inductive Types (HITs) becomes an expressive programming language with an appropriate logic.

Space

The logical limitation of dependent type theory is its handling of equality, which must be said to be at the core of any logic. For example, if we think about which function should be equal, then there is a decision to be made. Will we make it possible to distinguish functions that are defined in different ways but produce the same output? Perhaps we are interested in a logic where functions that take the same amount of time to run are always equal, regardless of input and output, so we can use it to describe resource consumption in a system. A common choice is not to distinguish between functions that give the same output for all inputs. This choice, functional extensionality, is critical when using type theory as a logic to reason about features at the requirements specification level. Here we are only interested in what our program actually does. Thus, the concept of equality is a central part of a logic, which determines which situations a logic can describe.

Dependent type theory in its modern form borrows from the mathematical field of homotopy theory. Equality in homotopy theory has a geometric interpretation: The objects you work with are, loosely speaking, geometric objects and two points in a shape are said to be equal if you can draw a line from one point to the other within the shape. It may seem rather abstract that you have to think of, for example, the types of programs that produce a number as a geometric shape. However, using geometric paths as a concept of equality has many good consequences. In addition to fulfilling the things one normally requires of equality, one can, for example, prove that it supports functional extensionality.

Time

To most, the definition of 'programming language' includes Turing completeness, which loosely speaking means “everything a computer is capable of”. Type theory as a programming language is not Turing-complete by default, as it lacks general recursion, i.e., the ability to define a function that calls itself. The reason why this is not part of the standard package for type theory is that general recursion allows us to prove all statements in the logic, thus preventing us from using the theory as a mathematical logic.

Guarded recursion is a way to allow recursive definitions without allowing the pathological proofs of logical statements. Here, a tool from modal logic is used to describe recursion by adding an explicit marking in the language that the recursive call should happen 'later'. This adds a way for the language to keep track of the order in which different tasks are performed, and it can thus prevent a calculation from having to use the result of a program that is not run until later.

Verification of critical software

Guarded recursion and CTT existed before my work, and they have even been combined before in different ways. The difference between the existing systems and what is described in the article is the specific treatment of coinductive types. It is easiest to understand the difference in an example.

The archetypal coinductive type is a stream of inputs - for example, a temperature measurement from a nuclear reactor every second. To understand programs that depend on such a stream of inputs, we use a coinductive stream of numbers. The coinductive programs will automatically be productive. Roughly speaking, this is a guarantee that the program will always produce its output based on a temperature measurement in finite time.

Where co-induction becomes important is in the specification process. Previous systems had different and stricter restrictions on what information they had access to about their programs. Specifically, it was not possible to compare programs that update after each temperature measurement and those that aggregate data from for the day's run, despite the fact that they may be functionally equivalent.

In addition to this, we have extended the system with the so-called HITs. A simple example of an inductive type is lists. In type theory, you think of lists as being built up from an empty list by adding elements. So, we have two ways to make a list. Either we can take the empty list, or we can also add an element to an existing list. Now suppose we did not care about the order of the objects we have in our list. So, we think that in the logic you should see [a,b] and [b,a] as equal, which for us means that there should be a path between the two lists. There is no way to do that in the inductive type of lists, but with a higher inductive type such a path can be inserted manually. The situation would then be that you had three ways to create a list: the empty one, compositions as before and now a list which is on the path between [a,b] and [b,a].

The basic research perspectives are long, and we are some years away from having an implemented programming language based on CCTT, but the article has provided a direction to work in.

The broad perspective of the article is that we are now adding one of the most important properties of type theory to the system: The ability to define new typeformers on-the-fly. If you have a problem that can be solved mathematically with coinduction, it can probably be described in the article’s system and thus justified mathematically.