A comparison between Haskell and Liquid Haskell, focusing on their approaches to type checking and on advanced topics such as kinds, data type promotion, type families, and refinement types. It also discusses the benefits of using Liquid Haskell for theorem proving and verification.
1.1 Motivation
Formal verification gives us the obvious benefits of fewer errors, increased reliability and increased trust in the proven system. However, formal verification of programs is not commonplace and is mostly reserved for special use cases. Theorem provers and programming languages that support formal verification, such as the theorem prover Coq or the dependently typed languages Agda and Idris, are mostly used in academia. There are many reasons for this, but one of the most important is that formal verification is hard.
The techniques used for formal verification in dependently typed languages rely on advanced type systems and complicated proof tactics. The process of doing formal verification also takes a lot of effort because of the precision needed, and it often leads to code that is a lot more complex than the code without verification. The performance of code written in dependently typed languages
Liquid Haskell, a verifier for Haskell programs, tries to avoid these issues. Using refinement types and an SMT solver, it tries to automate much of the verification. Instead of writing proofs with advanced proof tactics, the user can use equational reasoning. It avoids the performance problem by being a stand-alone type checker that does not affect the execution of the program, so the code is as performant as normal Haskell code.
It builds on the most widely used kind of verification today, the type checker, by extending normal Haskell types with predicates. This is done by extending the Haskell type system with Liquid types, which are a restricted form of refinement types designed to improve type inference.
The goal of this thesis is to find out how usable Liquid Haskell is in practice.
1.2 Contributions
The topic of this thesis is to explore Liquid Haskell. The investigation of the power of Liquid Haskell is approached from two angles.
Comparing Haskell and Liquid Haskell is a case study that compares the two approaches on two problems: proving the commutativity of addition on natural numbers, and implementing compile-time validity checking of propositional formulas. My contributions from this case study are as follows:
Verifying finger trees is a second case study. Here we use Liquid Haskell to verify properties of finger trees. We verified properties regarding the size of a finger tree, and also proved a property related to splitting a finger tree.
In particular, my contributions regarding the use of Liquid Haskell on finger trees are the following:
1.3 Chapter overview
Chapter 2 covers language extensions and techniques in Haskell that make type-level programming possible. The topics introduced here will be used in chapter 4.
Chapter 3 introduces Liquid Haskell and gives some insight into how it works. It also goes into depth about termination and totality, and why those are important topics in Liquid Haskell.
Chapter 4 contains a case study that compares doing theorem proving and verification in Haskell to doing it in Liquid Haskell. In particular, it compares the implementations of the solutions to the following problems:
The reader of this thesis is expected to have a decent understanding of functional programming and be somewhat familiar with Haskell, or at least be familiar with a somewhat similar language, like SML, OCaml or Idris.
Therefore, basic syntax and functionality, such as type classes, higher-order functions, pattern matching and algebraic data types, will not be covered in detail. Knowledge of perhaps the most famous and infamous aspect of Haskell, monads, is however not needed, because they are not used in this thesis. For a good introduction to Haskell and functional programming, the reader is referred to [Hut16].
However, in this thesis we will look into advanced techniques and functionality, such as type families, generalized algebraic data types and kinds. These techniques are necessary for the type-level programming that we will be doing in Chapter 3. These topics will be explained.
In this section we first look into the more advanced topics of kinds, kind polymorphism, generalized algebraic data types and type families.
2.1 Kinds and kind polymorphism
If we have the following data types
data T f = MakeT (f Int)
data Identity a = Identity a
data MyInt = MyInt Int
MakeT (Identity 1) would typecheck, but MakeT (MyInt 1) would not. Why is that?
Because the "type" of MyInt does not fit the "type" of f in T.
To understand why it does not fit, we need to understand kinds.
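As a hedged sketch of the mismatch, the kinds of the three types can be written out explicitly. The kind annotations and the names ok and unwrap are additions for illustration and are not part of the original definitions.

```haskell
{-# LANGUAGE KindSignatures #-}

import Data.Kind (Type)

-- T expects a type constructor of kind Type -> Type and applies it to Int.
data T (f :: Type -> Type) = MakeT (f Int)

-- Identity :: Type -> Type, so it fits the parameter f of T.
data Identity (a :: Type) = Identity a

-- MyInt :: Type. It takes no type argument, so it cannot play the role of f.
data MyInt = MyInt Int

-- Accepted: f is instantiated to Identity, and Identity 1 :: Identity Int.
ok :: T Identity
ok = MakeT (Identity 1)

-- Rejected if uncommented: MyInt 1 :: MyInt is not of the form f Int.
-- bad = MakeT (MyInt 1)

-- Hypothetical helper that extracts the wrapped Int again.
unwrap :: T Identity -> Int
unwrap (MakeT (Identity n)) = n
```

In GHCi, :kind T, :kind Identity and :kind MyInt display the kinds that the comments describe.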
in normal ADTs they all have to be the same. We will shortly see how to use GADTs to define singleton types
A singleton type is a type with only one inhabitant. That means that if we know the value, we know the type, and if we know the type, we know the value. So a singleton type has a bijective mapping from type to value.
data Bool = False | True
is not a singleton type, because if we have a value of type Bool, we can't decide whether the value is False or True based on the type.
But if we use GADTs to add a type parameter and specify the type of each constructor
-- B is to be used as a kind
data B = F | T
data GBool b where
  False :: GBool F
  True :: GBool T
then the GBool data type is a singleton type. We know that if the value is False, then the type is GBool F, and if the type is GBool F, the value must be False. The same holds for True and GBool T.
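A minimal compilable sketch of this definition follows. It assumes the DataKinds extension for using B as a kind, and it renames the constructors to GFalse and GTrue (a hypothetical choice) so that they do not clash with the Prelude's Bool. The functions onlyTrue and toBool are illustrative additions.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- With DataKinds, B is also available as a kind, and F and T as types of kind B.
data B = F | T

-- Each constructor fixes the type index, so type and value determine each other.
data GBool (b :: B) where
  GFalse :: GBool 'F
  GTrue  :: GBool 'T

-- Only GTrue has type GBool 'T, so this single clause covers every case.
onlyTrue :: GBool 'T -> String
onlyTrue GTrue = "the value must be GTrue"

-- Forget the type index and recover an ordinary Bool.
toBool :: GBool b -> Bool
toBool GFalse = False
toBool GTrue  = True
```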
We will now see how to define a singleton type for natural numbers.
Singleton types without GADTs?
If we try to make a singleton data type for natural numbers using normal ADTs, a first attempt might look something like this:
data Natural n = Zero | Succ (Natural n)
The types given to the data constructors now will be: Zero :: forall n. Natural n and Succ :: forall n. Natural n -> Natural n. This makes nonsensical types (in this context) such as Zero :: Natural String and Succ Zero :: Natural [Bool] accepted. To avoid this problem, we try to constrain the kind of the type parameter to be of kind Nat.
data Natural (n :: Nat) = Zero | Succ (Natural n)
With the kind annotation, we specify that the type parameter needs to be of kind Nat, so that it represents a natural number at the type level. The types given now will be: Zero :: forall (n :: Nat). Natural n and Succ :: forall (n :: Nat). Natural n -> Natural n. Now types such as Zero :: Natural String and Succ Zero :: Natural [Bool] won't be accepted, because the type parameters in those examples are not of kind Nat.
But we have another problem: we can't specify what the type in the type parameter should be for the different constructors Zero and Succ.
That means that we can have Zero :: Natural (S Z) or Zero :: Natural (S (S Z)) or Succ Zero :: Natural Z.
This means that we don't have a singleton type. We don't have a one-to-one mapping from the type to the value, because as we can see, the value Zero is of both type Natural (S Z) and Natural (S (S Z)).
Singleton types using GADTs
What we need is some way to specify that Zero is a constructor with the type Natural Z and Succ is a constructor that takes a natural number with type parameter n, and then returns a natural number with the type parameter (S n), to represent the additional layer of Succ.
data Natural (n :: Nat) where
  Zero :: Natural Z
  Succ :: Natural n -> Natural (S n)
Now the data constructors have the appropriate types, namely
Zero :: Natural Z
and
Succ :: forall (n :: Nat). Natural n -> Natural (S n)
Now we have an actual singleton type. If the value is Zero, we know that the type is Natural Z, and if the type is Natural Z, we know that the value is Zero. If the value is Succ (nat :: Natural n), we know that the type is Natural (S n), and if the type is Natural (S n), we know by the inductive hypothesis that the value is Succ (nat :: Natural n).
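The singleton definition can be tried out with the following sketch, which assumes the DataKinds, GADTs and KindSignatures extensions. The names toInt, two and predecessor are hypothetical helpers added for illustration.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- Nat is used as a kind; DataKinds promotes Z and S to the type level.
data Nat = Z | S Nat

data Natural (n :: Nat) where
  Zero :: Natural 'Z
  Succ :: Natural n -> Natural ('S n)

-- Forget the index and count the Succ layers.
toInt :: Natural n -> Int
toInt Zero     = 0
toInt (Succ m) = 1 + toInt m

-- The annotation has to agree with the value; Natural 'Z would be rejected here.
two :: Natural ('S ('S 'Z))
two = Succ (Succ Zero)

-- No Zero case is needed, because Zero can never have type Natural ('S n).
predecessor :: Natural ('S n) -> Natural n
predecessor (Succ m) = m
```

The type of predecessor shows the practical benefit of the singleton: the argument is known to be non-zero at compile time.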
2.4 Type families
To enable type level programming in Haskell, we have so far introduced a way to make custom kinds, which enables the creation of data types at the type level. But to use these custom kinds, we also need functions that work on types.
Type families are type level functions. They are enabled by the TypeFamilies extension [GHC15c].
As a simple example, we can look at a type family that is a type level implementation of the not function, that takes a type of kind B as argument and returns a type of kind B.
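A minimal sketch of such a type family might look as follows. It is written as a closed type family, which is an assumption about the intended form. The GBool singleton is repeated with renamed constructors (GFalse and GTrue, a hypothetical choice to avoid clashing with the Prelude) so that the type family can be exercised at the value level through the illustrative function gnot.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies #-}

-- B is used as a kind.
data B = F | T

-- A type level function from types of kind B to types of kind B.
type family Not (b :: B) :: B where
  Not 'F = 'T
  Not 'T = 'F

-- Singleton for B, so that values can carry the type level boolean.
data GBool (b :: B) where
  GFalse :: GBool 'F
  GTrue  :: GBool 'T

-- Value level negation whose result type is computed by Not.
gnot :: GBool b -> GBool (Not b)
gnot GFalse = GTrue
gnot GTrue  = GFalse

-- Forget the type index and recover an ordinary Bool.
toBool :: GBool b -> Bool
toBool GFalse = False
toBool GTrue  = True
```

In each clause of gnot the pattern match tells the type checker what b is, so Not b reduces to a concrete type and swapping the two right-hand sides would be a type error.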
3.1 Liquid Haskell
Liquid Haskell (LH) is a refinement type checker that uses Liquid Types. In this section we will look at an overview of the features in Liquid Haskell. For a detailed look at Liquid Haskell, the reader is referred to [VSJ14a].
Statically typed languages, like Haskell, can prevent a lot of runtime errors at compile time through type checking, that is, by verifying the type safety of a program. A program is type safe if it does not contain type errors. This catches errors such as passing an integer to a function that expects a string.
But what about if you have a function that expects a number between 0 and
This is where refinement types come into play.
Refinement types make it possible to encode invariants by combining types and logical predicates [VSJ+14b]. These predicates need to be SMT-decidable, which means that they can only include formulas from decidable logics. This helps automate the type checking.
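As a hedged illustration of a type combined with a predicate, the sketch below refines Int to positive numbers and uses that as the type of a divisor. The names Pos, safeDiv and average are chosen for this example. The annotations live in special comments, so GHC compiles the file unchanged and only Liquid Haskell reads them.

```haskell
-- Refinement type alias: the integers v for which the predicate v > 0 holds.
{-@ type Pos = {v:Int | v > 0} @-}

-- The divisor must be positive, which rules out division by zero.
{-@ safeDiv :: Int -> Pos -> Int @-}
safeDiv :: Int -> Int -> Int
safeDiv n d = n `div` d

-- The input list must be non-empty, so its length is a valid divisor.
-- len is Liquid Haskell's built-in measure for the length of a list.
{-@ average :: {xs:[Int] | len xs > 0} -> Int @-}
average :: [Int] -> Int
average xs = safeDiv (sum xs) (length xs)

-- Expected to be rejected by Liquid Haskell if uncommented, since 0 is not a Pos:
-- bad = safeDiv 10 0
```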
SMT: Satisfiability modulo theories
Satisfiability modulo theories (SMT) is a generalization of boolean satisfiability. It extends boolean satisfiability with additional first-order theories, such as equality reasoning, arithmetic, arrays and more [RR08].
Logically Qualified Data Types, abbreviated to Liquid types, were introduced in [RKJ08]. Liquid types are a restricted form of dependent types,
When doing non-trivial proofs in Liquid Haskell, we need to use proof combinators. These enable us to do equational reasoning.
The ones used in this thesis are:
=== : the key combinator in equational reasoning. It ensures that the left and right hand sides are equal.
? : adds a proof fact from the right side to the left side. It can be seen as the combinator that lets us use other proofs or lemmas.
*** and QED : create a proof from any value.
Examples of equational proofs in Liquid Haskell will be seen in chapter 3, when proving properties of natural numbers, and in chapter 4, when proving properties of finger trees.
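To give a first impression of how these combinators are used together, here is a hedged sketch of an equational proof that Z is a right identity of addition. The names N, plus and plusZeroR are chosen for this illustration. In a Liquid Haskell project the combinators would be imported from Language.Haskell.Liquid.ProofCombinators; the local definitions below are unrefined stand-ins, included only so that the sketch compiles with plain GHC, and they would have to be replaced by that import for the proof to be checked.

```haskell
{-@ LIQUID "--reflection" @-}

-- Local stand-ins (assumption) with the same shapes as the library combinators.
type Proof = ()
data QED = QED

infixl 3 ===, ?, ***

-- Chains two expressions that are claimed to be equal.
(===) :: a -> a -> a
_ === y = y

-- Brings the fact established by another proof into scope.
(?) :: a -> b -> a
x ? _ = x

-- Turns the finished chain into a Proof.
(***) :: a -> QED -> Proof
_ *** QED = ()

data N = Z | S N deriving Eq

-- reflect makes the definition of plus available inside refinements.
{-@ reflect plus @-}
plus :: N -> N -> N
plus Z     m = m
plus (S n) m = S (plus n m)

-- The refinement states the theorem; the function body is the proof.
{-@ plusZeroR :: n:N -> { plus n Z == n } @-}
plusZeroR :: N -> Proof
plusZeroR Z
  =   plus Z Z
  === Z
  *** QED
plusZeroR (S n)
  =   plus (S n) Z
  === S (plus n Z) ? plusZeroR n   -- use the induction hypothesis
  === S n
  *** QED
```

The recursive call plusZeroR n plays the role of the induction hypothesis, which is why termination of proofs matters for soundness.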
3.2 Termination and totality
The standard definition of a total function is a function that is defined for all possible input values. That means that it terminates and returns a value for all arguments. In an LH context, the termination part is treated separately from the part about being defined for all input values. Totality checking therefore refers to checking that a function has a case or guard for every possible value of the input type. This is also the convention I will use in this thesis.
Totality and termination are important aspects of programming. We want to reason about the totality of our functions to ensure that they are defined for all cases, and we want to be sure that our functions terminate and do not end up in infinite loops. In verified programs and theorem proving there is an extra dimension, in that both are required to make the verification sound. The following section will go into the following aspects of termination and totality
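The following sketch illustrates totality in this sense. The function safeHead is a hypothetical example with no case for the empty list. The refinement restricts the input to non-empty lists, so Liquid Haskell is expected to accept the definition as total, whereas without the annotation the missing case would be reported.

```haskell
-- The refinement excludes the empty list, so the missing [] case is unreachable.
-- len is Liquid Haskell's built-in measure for the length of a list.
{-@ safeHead :: {xs:[a] | len xs > 0} -> a @-}
safeHead :: [a] -> a
safeHead (x:_) = x

-- Expected to be rejected by Liquid Haskell if uncommented,
-- because the argument does not satisfy len xs > 0:
-- bad = safeHead []
```

Plain GHC only warns about the incomplete pattern (with -Wincomplete-patterns) and the function would fail at runtime on an empty list.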
To ensure the soundness of the LH refinements, LH requires proof of termination for functions.
To say that a function terminates means that for all arguments, the function does not lead to infinite computation. That means that a terminating function will never loop forever, but will reach a base case for every argument. Liquid Haskell uses a well-founded metric, such as the lexicographic order of natural numbers or the structural size of ADTs, to verify that functions are terminating.
Structural termination
The structural termination checker can automatically prove termination by detecting the common pattern where the argument to the recursive call is a subterm of the original function argument [VBK+18]. Most of the functions in this thesis are accepted by the structural termination checker.
Computing the length of a linked list is common in functional programming.
length [] = 0
length (x:xs) = 1 + length xs
The argument to the recursive call, xs, is a subterm of the original function argument x:xs. This is recognized by the structural termination checker, and therefore it accepts the definition as terminating.
More complex termination and semantic termination
But the structural termination checker is not always enough. There are a lot of functions that terminate, but where the recursion is more complex than just using subterms as arguments. Even a simple function such as
range n m = if n <= m then n : range (n+1) m else []
cannot be proven to terminate using a structural checker. The first argument in the recursive call of range is not a subterm of the original first argument. So the structural size of the arguments does not decrease, but is there anything else that decreases? Yes, the difference between the two arguments. Since the recursion only happens when n <= m, the difference m-n is non-negative, and when n increases, m-n decreases. m-n is the metric that shows that this function terminates.
We have now moved from structural recursion to a more advanced form of termination checking that the LH authors call semantic termination [VBK+18]. This is done by providing an explicit termination argument, which is an expression that is calculated from the function arguments and decreases in each call. The termination argument has the form [e_1, e_2, ...., e_n], where the expressions e_i most often depend on the function arguments (we will see one example later where we use a constant expression to prove the termination of mutually recursive functions). The expressions must evaluate to natural numbers and they must lexicographically decrease at each recursive function call.
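A termination argument with more than one expression can be illustrated with the Ackermann function, which is not taken from this thesis but is a standard example of lexicographic termination. In the sketch below, the part after the slash in the annotation is the termination argument [m, n], and Nat is Liquid Haskell's alias for non-negative Int values.

```haskell
-- Termination argument [m, n]: at every recursive call either m decreases,
-- or m stays the same and n decreases.
{-@ ack :: m:Nat -> n:Nat -> Nat / [m, n] @-}
ack :: Int -> Int -> Int
ack m n
  | m == 0    = n + 1
  | n == 0    = ack (m - 1) 1                 -- m decreases
  | otherwise = ack (m - 1) (ack m (n - 1))   -- outer call: m decreases
                                              -- inner call: m equal, n decreases
```

Neither argument decreases on its own in every call, which is why a single expression is not enough here and the lexicographic order is needed.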