Verification of Haskell programs
using Liquid Haskell
Morten Aske Kolstad
Det matematisk-naturvitenskapelige fakultet
UNIVERSITETET I OSLO
November 15, 2019

Chapter 1

Introduction

1.1 Motivation

Issues with formal verification

Formal verification gives us the obvious benefits of reducing errors, increasing reliability and increasing trust in the proven system. However, formal verification of programs is not commonplace and is mostly reserved for special use cases. Theorem provers and programming languages that support formal verification, such as the theorem prover Coq or the dependently typed languages Agda and Idris, are mostly used in academia. There are many reasons why this is the case, but one of the most important is that formal verification is hard.

The techniques used to do formal verification in dependently typed languages rely on advanced type systems and complicated proof tactics. The process of doing formal verification also takes a lot of effort because of the precision needed, and it often leads to code that is a lot more complex than the code without verification. The performance of code written in dependently typed languages can also be a problem.

Liquid Haskell

Liquid Haskell, a verifier for Haskell programs, tries to avoid these issues. With the use of refinement types and an SMT solver, it tries to automate much of the verification. Instead of writing proofs with advanced proof tactics, the user can use equational reasoning. It avoids the performance problem by being a stand-alone type checker that does not affect the execution of the program, so the code is as performant as normal Haskell code.

It builds on the most widely used kind of verification today, the type checker, by extending normal Haskell types with predicates. This is done by extending the Haskell type system with Liquid types, which are a restricted form of refinement types designed to improve type inference.

The goal of this thesis is to find out how usable Liquid Haskell is in practice.

1.2 Contributions

The topic of this thesis is to explore Liquid Haskell. The investigation of the power of Liquid Haskell is approached from two angles.

Comparing Haskell and Liquid Haskell is a case study that compares the two approaches on two problems: proving the commutativity of addition on natural numbers, and implementing compile-time validity checking of propositional formulas. My contributions from this case study are as follows:

  • A comparison between verification done in Haskell and Liquid Haskell
  • A type level sequent calculus for propositional logic in Haskell

Verifying finger trees is a second case study. Here we look into the use of Liquid Haskell to verify properties of finger trees. We verified properties regarding the size of a finger tree, and also proved a property related to splitting a finger tree.

In particular, my contributions regarding the use of Liquid Haskell on finger trees are the following:

  • An implementation of verified size properties of the multi-purpose data structure finger trees
  • An implementation that uses Liquid Haskell to prove properties of a non-trivial data structure involving polymorphic recursion and algebraic structures

1.3 Chapter overview

Chapter 2 covers language extensions and techniques in Haskell that make it possible to do type level programming. The topics introduced here will be used in chapter 4.

Chapter 3 introduces Liquid Haskell and gives some insight into how it works. It also goes into depth about termination and totality, and why those are important topics in Liquid Haskell.

Chapter 4 contains a case study that compares doing theorem proving and verification in Haskell to doing it in Liquid Haskell. In particular, it compares implementations of solutions to two problems: proving the commutativity of addition on natural numbers, and compile-time validity checking of propositional formulas.

Chapter 2

Haskell

The reader of this thesis is expected to have a decent understanding of functional programming and be somewhat familiar with Haskell, or at least be familiar with a somewhat similar language, like SML, OCaml or Idris.

Therefore basic syntax and functionality, such as type classes, higher-order functions, pattern matching and algebraic data types, will not be covered in detail. Knowledge of perhaps the most famous and infamous aspect of Haskell, monads, is however not needed, because they are not used in this thesis. For a good introduction to Haskell and functional programming, the reader is referred to [Hut16].

However, in this thesis we will look into advanced techniques and functionality, such as type families, generalized algebraic data types and kinds. These techniques are necessary for the type level programming that we will be doing in chapter 4, so they will be explained here.

In this section we first look into the more advanced topics of kinds, kind polymorphism, generalized algebraic data types and type families.

2.1 Kinds and kind polymorphism

If we have the following data types

data T f = MakeT (f Int)

data Identity a = Identity a

data MyInt = MyInt Int

MakeT (Identity 1) would typecheck, but MakeT (MyInt 1) would not. Why is that?

Because the "type" of MyInt does not fit the "type" of f in T.

To understand why it does not fit, we need to understand kinds.
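As a rough illustration of the mismatch, the three data types can be given explicit kind annotations. This is only a sketch: it assumes the KindSignatures extension and the name Type from Data.Kind (often written * in older code), and the helper unwrap is hypothetical, added just to observe the result.

```haskell
{-# LANGUAGE KindSignatures #-}
import Data.Kind (Type)

-- f is applied to Int, so it must be a type constructor of kind Type -> Type.
data T (f :: Type -> Type) = MakeT (f Int)

-- Identity takes one type argument, so Identity :: Type -> Type.
data Identity (a :: Type) = Identity a

-- MyInt takes no type arguments, so MyInt :: Type.
data MyInt = MyInt Int

-- Accepted: f is instantiated to Identity, and Identity 1 :: Identity Int.
ok :: T Identity
ok = MakeT (Identity 1)

-- Rejected if uncommented: MyInt has kind Type, not Type -> Type.
-- bad = MakeT (MyInt 1)

-- Hypothetical helper used only to inspect the wrapped value.
unwrap :: T Identity -> Int
unwrap (MakeT (Identity n)) = n
```

The commented-out definition is the interesting part: MyInt is already a complete type, so it cannot be applied to Int the way f is.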

in normal ADTs they all have to be the same. We will shortly see how to use GADTs to define singleton types

Singleton types

A singleton type is a type with only one inhabitant. That means that if we know the value, we know the type, and if we know the type, we know the value. So a singleton type has a bijective mapping from type to value.

data Bool = False | True

is not a singleton type, because if we have a value of type Bool, we can't decide whether the value is False or True based on the type.

But if we use GADTs to add a type parameter and specify the type of each constructor, we get a singleton type:

-- B is to be used as a kind
data B = F | T

data GBool b where
  False :: GBool F
  True :: GBool T

The GBool data type defined using GADTs is a singleton type, because we know that if the value is False, then the type is GBool F, and if the type is GBool F, the value must be False. The same holds for True and GBool T.
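The singleton property can be tried out in a small self-contained module. This is a sketch under some assumptions: the constructors are renamed to GFalse and GTrue (hypothetical names) so they do not clash with the Prelude's Bool, the promoted constructors are written with a tick as the DataKinds extension allows, and onlyTrue and toBool are illustrative helpers.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- B is promoted to a kind by DataKinds; 'F and 'T are its type-level inhabitants.
data B = F | T

-- Constructors renamed (hypothetically) to avoid clashing with Prelude's False/True.
data GBool (b :: B) where
  GFalse :: GBool 'F
  GTrue  :: GBool 'T

-- One clause suffices: GTrue is the only constructor whose type is GBool 'T.
onlyTrue :: GBool 'T -> String
onlyTrue GTrue = "the value must be GTrue"

-- Recover an ordinary Bool from the singleton value.
toBool :: GBool b -> Bool
toBool GFalse = False
toBool GTrue  = True
```

Adding a clause onlyTrue GFalse = ... would be a type error, which is the type-to-value direction of the bijection.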

We will now see how to define a singleton type for natural numbers.

Singleton types without GADTs?

If we try to make a singleton data type for natural numbers using normal ADTs, a first effort would maybe look something like this:

data Natural n = Zero | Succ (Natural n)

The types given to the data constructors will now be: Zero :: forall n. Natural n and Succ :: forall n. Natural n -> Natural n. This makes nonsensical types (in this context) such as Zero :: Natural String and Succ Zero :: Natural [Bool] accepted. To avoid this problem, we try to constrain the type parameter to be of kind Nat.

data Natural (n :: Nat) = Zero | Succ (Natural n)

With the kind annotation, we specify that the type parameter needs to be of kind Nat, so that it represents a natural number at the type level. The types given will now be: Zero :: forall (n :: Nat). Natural n and Succ :: forall (n :: Nat). Natural n -> Natural n. Now types such as Zero :: Natural String and Succ Zero :: Natural [Bool] won't be accepted, because the type parameters in those examples are not of kind Nat.

But we have another problem: we can't specify what the type parameter should be for the different constructors Zero and Succ.

That means that we can have Zero :: Natural (S Z) or Zero :: Natural (S (S Z)) or Succ Zero :: Natural Z.

This means that we don't have a singleton type. We don't have a one-to-one mapping from the type to the value, because as we can see, the value Zero has both the type Natural (S Z) and the type Natural (S (S Z)).

Singleton types using GADTs

What we need is some way to specify that Zero is a constructor with the type Natural Z and Succ is a constructor that takes a natural number with type parameter n, and then returns a natural number with the type parameter (S n), to represent the additional layer of Succ.

data Natural (n :: Nat) where
  Zero :: Natural Z
  Succ :: Natural n -> Natural (S n)

Now the data constructors have the appropriate types, namely

Zero :: Natural Z

and

Succ :: forall (n :: Nat). Natural n -> Natural (S n)

Now we have an actual singleton type. If the value is Zero, we know that the type is Natural Z, and if the type is Natural Z, we know that the value is Zero. If the value is Succ (nat :: Natural n), we know that the type is Natural (S n), and if the type is Natural (S n), we know that the value is Succ (nat :: Natural n), by using the inductive hypothesis.
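The GADT definition can be exercised in a small module. This is a sketch: the definition of Nat is assumed here (data Nat = Z | S Nat, promoted with DataKinds), since it is not shown in this excerpt, and the helpers two, toInt and predNat are hypothetical names added for illustration.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- Assumed definition of the promoted kind Nat (not shown in this excerpt).
data Nat = Z | S Nat

data Natural (n :: Nat) where
  Zero :: Natural 'Z
  Succ :: Natural n -> Natural ('S n)

-- The type records the value: two layers of Succ give two layers of 'S.
two :: Natural ('S ('S 'Z))
two = Succ (Succ Zero)

-- Rejected if uncommented, since Zero can only have type Natural 'Z:
-- wrong :: Natural ('S 'Z)
-- wrong = Zero

-- Convert the singleton back to an ordinary Int.
toInt :: Natural n -> Int
toInt Zero     = 0
toInt (Succ m) = 1 + toInt m

-- The argument type rules out Zero, so no Zero clause is needed.
predNat :: Natural ('S n) -> Natural n
predNat (Succ m) = m
```

The predNat function shows a practical payoff of the singleton: the index in the type makes the missing Zero case unnecessary instead of a runtime error.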

2.4 Type families

To enable type level programming in Haskell, we have so far introduced a way to make custom kinds, which enables the creation of data types at the type level. But to use these custom kinds, we also need functions that work on types.

Type families are type level functions. They are enabled by the TypeFamilies extension [GHC15c].

As a simple example, we can look at a type family that is a type level implementation of the not function, that takes a type of kind B as argument and returns a type of kind B.
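The thesis's own definition is not part of this excerpt, so the following is only one way such a type family might look. It is written as a closed type family; the name Not, the singleton GBool with constructors GFalse and GTrue, and the helpers gnot, doubleNeg and toBool are all assumptions made for the sketch.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies #-}

data B = F | T

-- A closed type family: a function from types of kind B to types of kind B.
type family Not (b :: B) :: B where
  Not 'F = 'T
  Not 'T = 'F

-- Singleton for B, so that Not can be observed through values.
data GBool (b :: B) where
  GFalse :: GBool 'F
  GTrue  :: GBool 'T

-- Value-level negation whose result type is computed by Not.
gnot :: GBool b -> GBool (Not b)
gnot GFalse = GTrue
gnot GTrue  = GFalse

-- Type checks only because Not (Not 'T) reduces to 'T.
doubleNeg :: GBool 'T
doubleNeg = gnot (gnot GTrue)

toBool :: GBool b -> Bool
toBool GFalse = False
toBool GTrue  = True
```

Each clause of gnot is checked against the matching equation of Not, so swapping the right-hand sides would be rejected by the type checker.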

Chapter 3

Liquid Haskell

3.1 Liquid Haskell

Liquid Haskell (LH) is a refinement type checker that uses Liquid Types. In this section we will look at an overview of the features in Liquid Haskell. For a detailed look at Liquid Haskell, the reader is referred to [VSJ14a].

Statically typed languages, like Haskell, can prevent many runtime errors at compile time through type checking, that is, by verifying the type safety of a program. That a program is type safe means that it does not contain type errors. With this, you can catch errors such as giving an integer as an argument to a function that expects a string.

But what if you have a function that expects a number between 0 and 1, or a function that expects a list with two or more elements? How would one use a normal type system to verify that these properties hold?

This is where refinement types come into play.

Refinement types

Refinement types make it possible to encode invariants by combining types and logical predicates [VSJ+14b]. These predicates need to be SMT-decidable, which means that they can only include formulas from decidable logics. This is to help automation of the type checking.
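As a rough sketch of what such invariants can look like, the two requirements mentioned earlier might be written as Liquid Haskell annotations along the following lines. The annotations live in special comments, so plain GHC compiles the module unchanged and only Liquid Haskell checks them. The names Percent, clamp and second are hypothetical, the numeric example uses an integer range 0 to 100 instead of a real number between 0 and 1, and len is assumed to be Liquid Haskell's built-in list length measure.

```haskell
-- Liquid Haskell reads the {-@ ... @-} comments; plain GHC ignores them.

-- An Int refined by the predicate 0 <= v <= 100.
{-@ type Percent = {v:Int | 0 <= v && v <= 100} @-}

-- Every branch returns a value inside the range, so the result is a Percent.
{-@ clamp :: Int -> Percent @-}
clamp :: Int -> Int
clamp x
  | x < 0     = 0
  | x > 100   = 100
  | otherwise = x

-- The argument must be a list with at least two elements.
-- The clauses for shorter lists are left out on purpose: the refinement
-- is what is meant to rule them out.
{-@ second :: {v:[a] | len v >= 2} -> a @-}
second :: [a] -> a
second (_ : y : _) = y
```

Under these annotations a call such as second [1] would be expected to be rejected by Liquid Haskell, while plain GHC would accept it and fail at run time.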

SMT: Satisfiability modulo theories

Satisfiability modulo theories (SMT) is a generalization of boolean satisfiability. It generalizes it by adding first-order theories, such as equality reasoning, arithmetic, arrays and more [RR08].

Logically Qualified Data Types - Liquid types

Logically Qualified Data Types, abbreviated to Liquid types, were introduced in [RKJ08]. Liquid types are a restricted form of dependent types.

Proofs in Liquid Haskell

When doing non-trivial proofs in Liquid Haskell, we need to use proof combinators. These enable us to do equational reasoning.

The ones used in this thesis are:

  • === : the key combinator in equational reasoning. It ensures that the left and right hand sides are equal.
  • ? : adds a proof fact from the right side to the left side. Can be seen as the combinator that lets us use other proofs or lemmas.
  • *** and QED : creates a proof from any value.

Examples of equational proofs in Liquid Haskell will be seen in chapter 4, when proving properties of natural numbers, and in the later case study on finger trees.
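To give a feel for the shape of such a proof, here is a small sketch of a lemma stating that adding zero on the right leaves a natural number unchanged. Several things are assumptions: the combinators are defined locally as simplified stand-ins so that the module compiles with plain GHC, whereas with Liquid Haskell they would, to my understanding, be imported from Language.Haskell.Liquid.ProofCombinators; the names N, add and addZero are hypothetical; and the annotations have not been run through Liquid Haskell here.

```haskell
{-@ LIQUID "--reflection" @-}

-- Local stand-ins for the proof combinators (assumed shapes, for plain GHC only).
type Proof = ()
data QED = QED

infixl 3 ===, ?, ***

-- Equational step: both sides have the same type; the right side is carried on.
(===) :: a -> a -> a
_ === y = y

-- Attach a lemma (any value) to the current expression.
(?) :: a -> b -> a
x ? _ = x

-- Turn the final expression into a proof.
(***) :: a -> QED -> Proof
_ *** QED = ()

data N = Z | S N

{-@ reflect add @-}
add :: N -> N -> N
add Z     m = m
add (S n) m = S (add n m)

-- Claim: add n Z == n, proved by induction on n.
{-@ addZero :: n:N -> { add n Z == n } @-}
addZero :: N -> Proof
addZero Z
  =   add Z Z
  === Z
  *** QED
addZero (S n)
  =   add (S n) Z
  === S (add n Z) ? addZero n   -- the recursive call is the inductive hypothesis
  === S n
  *** QED
```

At run time the proof is just a function returning the unit value; the content of the proof is in the refinement type that Liquid Haskell would check for each === step.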

3.2 Termination and totality

The standard definition of a total function is a function that is defined for all possible input values. That means that it terminates and returns a value for all arguments. In an LH context, when referring to totality and total functions, the part about being defined for all input values is separated out. So in an LH context, totality checking refers to checking that functions have a case or guard for every possible value of the input type. This is also the convention I will use in this thesis.

Totality and termination are important aspects of programming. We want to reason about the totality of our functions to ensure that they are defined for all cases, and we want to be sure that our functions terminate and do not end up in infinite loops. In verified programs and theorem proving there is an extra dimension, in that both are required to make the verification sound. The following section covers these aspects of termination and totality:

  • different techniques used in LH to prove termination
  • how to use non-terminating functions in LH
  • how non-termination can lead to falsehoods
  • the difference between the GHC exhaustiveness checker and the LH totality checker
  • how partial functions can lead to falsehoods

3.2.1 Termination

To ensure the soundness of the LH refinements, LH requires proof of termination for functions.

To say that a function terminates means that for all arguments, the function does not lead to infinite computation. That means a function that terminates will never loop forever, but will reach a base case for every argument. Liquid Haskell uses a well-founded metric, such as the lexicographic order of natural numbers or the structural size of ADTs, to verify that functions are terminating.

Structural termination

The structural termination checker can automatically prove termination by detecting the common pattern where the argument to the recursive call is a subterm of the original function argument [VBK+18]. Most of the functions in this thesis are accepted by the structural termination checker.

Computing the length of a linked list is a common task in functional programming.

length [] = 0
length (x:xs) = 1 + length xs

The argument to the recursive call, xs, is a subterm of the original function argument x:xs. This is recognized by the structural termination checker, and therefore it accepts the definition as terminating.

More complex termination and semantic termination

But the structural termination checker is not always enough. There are many functions that terminate, but where the recursion is more complex than just using subterms as arguments. Even a simple function such as

range n m = if n <= m then n : range (n+1) m else []

cannot be proven to terminate using a structural checker. The first argument in the recursive call of range is not a subterm of the original first argument. So the structural size of the arguments does not decrease, but is there anything else that decreases? Yes, the difference between the two arguments. Since the recursion only happens when n <= m, the difference m-n is non-negative there, and when n increases, m-n decreases. m-n is the metric that shows that this function terminates.

We have now moved from structural recursion to a more advanced form of termination checking that the LH authors call semantic termination [VBK+18]. This is done by providing an explicit termination argument, which is an expression that decreases in each call and is calculated from the function arguments. The termination argument is of the form [e_1, e_2, ..., e_n], where the expressions e_i most often depend on the function arguments (we will see one example later where we use a constant expression to prove the termination of mutually recursive functions). The expressions must evaluate to natural numbers and they must lexicographically decrease at each recursive function call.
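A sketch of how explicit termination arguments might be written follows. The metric goes after a slash in the Liquid Haskell signature, and plain GHC ignores the annotations. Two choices here are assumptions of the sketch and not taken from the thesis: for range the metric is m - n + 1 and not the bare difference, so that it is still a natural number at the last recursive call (where n + 1 exceeds m by one), and the Ackermann function ack is added as a commonly used illustration of a lexicographic metric with two components.

```haskell
-- Plain GHC ignores the {-@ ... @-} annotations; Liquid Haskell reads them.

-- Metric m - n + 1: at least 1 whenever n <= m, and one smaller at each call,
-- so it stays a natural number at every recursive call.
{-@ range :: n:Int -> m:Int -> [Int] / [m - n + 1] @-}
range :: Int -> Int -> [Int]
range n m = if n <= m then n : range (n + 1) m else []

-- Lexicographic metric [m, n]: either m decreases,
-- or m stays the same and n decreases.
{-@ ack :: m:Nat -> n:Nat -> Nat / [m, n] @-}
ack :: Int -> Int -> Int
ack m n
  | m == 0    = n + 1
  | n == 0    = ack (m - 1) 1
  | otherwise = ack (m - 1) (ack m (n - 1))
```

In ack the inner call keeps m fixed and decreases n, while the outer calls decrease m and may pass an arbitrarily large second argument, which is exactly the situation a single numeric metric cannot describe.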