Páli Gábor János
January 4, 2016

module Tutorial where
import Prelude hiding (abs, const, filter, head, length, map, negate, null, foldr, repeat, replicate, sum, tail, take, uncurry, (.))
import System.Environment

Introduction

Functional programming is a branch of the declarative paradigm. Such programs are built up by putting a series of function and type definitions one after another, like dominoes, which are then eventually pushed over from the top by what is usually called a start expression. In this tutorial, we are going to use the Haskell programming language to demonstrate the basics of this programming style.

In functional programs, the author does not fully determine how exactly the pieces will fall into their places, only the rules for that. As a result, the value of the final computation is often the answer of the whole program. This gives the programmer great freedom and focus to describe problems from a rather abstract, mathematical perspective, through formulas, which immediately turn into definitions of functions, for example:

negate x = 0 - x

The function above calculates the negated value of a number by subtracting it from zero. We call it negate and it has a single parameter, x. The parameter is separated from the body of the definition by the = symbol, which gives us an equation. Note that a nice property of pure functions (like the ones that can be defined in Haskell) is that their application is basically a substitution of the expression on the left-hand side with the expression on the right-hand side of the equation.
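
To illustrate this substitution reading, here is a tiny hand evaluation, in the same spirit as the traces shown later in this tutorial (just a sketch of how one may unfold it on paper):

negate 5
  < substitute the right-hand side of negate, with x = 5 >
0 - 5
-5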

In order to make use of this freshly defined function, place this line into a text file and name it Tutorial.hs. It can then be loaded for interpretation with the ghci program, the interactive frontend of the Glorious Glasgow Haskell Compiler, or GHC for short. (On Windows, it is enough to double click on the icon of the file.)

$ ghci Tutorial.hs
GHCi, version 7.10.3: http://www.haskell.org/ghc/  :? for help
[1 of 1] Compiling Tutorial         ( Tutorial.hs, interpreted )
Ok, modules loaded: Tutorial.

Once the file is successfully loaded, the interpreter lets us invoke and compose functions, in a fashion similar to a calculator. It also has a prompt that says Prelude>. That is the name of a module that is always loaded along with ours (unless it is disabled), and it contains many of the basic definitions we are also going to see.

Prelude> negate 1
-1
Prelude> negate (negate 1)
1
Prelude> negate -1
<interactive>:3:1:
    Non type-variable argument in the constraint: Num (a -> a)
    (Use FlexibleContexts to permit this)
    When checking that ‘it’ has the inferred type
      it :: forall a. (Num a, Num (a -> a)) => a -> a

The interaction with the interpreter is called the Read-Eval-Print Loop (REPL). First, the interpreter tries to parse the input expression, then, if that is successful, evaluates it, and finally it prints the result.

Unfortunately, the last expression did not work out well. This is an example of the compiler not being able to make sense of the input even though it looks syntactically all right. It is the result of the semantical evaluation of the program, called type checking. We do not go into the exact details for now, only note that this happens because in Haskell the - symbol is parsed both as the binary operator of subtraction and as the unary operator of negation. This may cause a slight misunderstanding, so as a rule of thumb, it is better to put negative numbers into parentheses to avoid the problem:

Prelude> negate (-1)
1

Exercise. Define the increment function that increases the value of its argument by one. It shall work like this:

Prelude> increment (-1)
0
Prelude> increment 3
4

Note that the module has to be reloaded after each change to the program. This can be achieved with the :reload command. Commands start with the : (colon) symbol, which distinguishes them from expressions, so they are processed differently.

Prelude> :reload
[1 of 1] Compiling Tutorial         ( Tutorial.hs, interpreted )
Ok, modules loaded: Tutorial.

As a shorthand, :r may also be used, or simply :, which repeats the last command. The :? command gives the list of all commands.

Branching in Functions

Function definitions are often not that straightforward, as sometimes they have to involve branching. For example, assume that we would like to write a function that calculates the absolute value of a number. The corresponding algorithm needs to determine first whether the number is less than zero, and negate it only if that holds. In any other case, it shall just leave the number as it is.

Note that since values cannot be changed through the program's lifetime — that is the property of single assignment — we will not touch the argument at all but return a fresh modified value instead.

Branching in Haskell programs can be implemented by the use of guard expressions, or simply "guards". Guards are Boolean-valued expressions that can be attached to the left-hand side of function definitions by the | (pipe) symbol.

Consider the following definition of a specialized version for division, where the actual calculation is performed only if the divisor is not zero:

x ./. y | y /= 0 = x / y

With this, the function body will only be evaluated if the guard condition is satisfied, otherwise the body in question is skipped and the next one is tried, from top to bottom. In the absence of further possible bodies, the interpreter complains that it is not able to compute the return value of the function.

Prelude> 42 ./. 0
*** Exception: Tutorial.hs:5:3-26: Non-exhaustive patterns in function ./.
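
When the guard condition holds, the function works as expected (assuming the definition above has been loaded; the result defaults to a floating-point number in GHCi):

Prelude> 42 ./. 6
7.0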

That is called a partial function — a function that may not map every input to an output. As it causes an exception to be raised and therefore may grind the execution of the whole program to a halt, careful attention must be paid to avoid such definitions. The opposite of this is the total function, where every element has an image assigned.

Let us now define abs, the absolute value function, using guards. Mind the indentation — there is an expectation from the compiler, called the off-side rule, which goes like this: in multiline definitions, the lines that follow the first line have to be indented with spaces (tab characters are not appreciated).

abs x | x < 0     = negate x
      | otherwise = x

There we can see two different bodies: the top one takes care of the exceptional case, that is, the negative numbers, and applies the previously defined negate function to them, while the bottom one handles everything else. The latter one is called the "fallthrough" and that is the guard that lets everybody pass. It has otherwise as the condition, but it can be quickly checked that this corresponds to the trivially true value:

Prelude> otherwise
True

Note that since the semantics of guards is technically "first satisfied wins", if otherwise appears before anything else, it will just shadow (overlap with) the rest of the branches.
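
As a small sketch of that (the name abs'' is made up here, to avoid clashing with the abs defined above), the second alternative in the following definition can never be selected, so the function always returns its argument unchanged:

abs'' x | otherwise = x
        | x < 0     = negate x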

Haskell also includes an if construct that acts almost like the if statement in other languages. However, there is an important difference: in Haskell, there are no statements, everything is considered an expression. As a consequence, if is treated as if it were a function, thus both of its branches have to be specified.

Prelude> if True then 1 else 0
1

Along those lines, the definition of abs could be rethought with the use of if:

abs' x =
  if (x < 0) then (negate x)
             else x

But guards clearly show their advantage over the regular if construct when there are multiple branches, so their use is usually recommended.

Exercise. Define the sgn (signum) function that returns -1 for negative numbers, 1 for positives, and 0 for zero.

Working with Lists

Besides integers, Haskell features many other types (and it also lets us define our own), for example, as we have seen with otherwise, we can work with Booleans. Another common type is the list, a heritage reaching far back to the first functional programming language, LISP (as in "list processing"). It can be used as an ordinary container for values. The simplest list is the empty list.

Prelude> []
[]

But one can also create a more advanced list by enumerating all of its elements:

Prelude> [1,2,3,4,5]
[1,2,3,4,5]

or even letting the interpreter enumerate the elements and set only the lower and upper bounds:

Prelude> [1..5]
[1,2,3,4,5]

Note that this is not the only way to build lists. Actually, lists are internally built with the previously introduced empty list constant and the "cons" operator, which is identified by the : symbol. The : operator takes an element and a list and puts them together into a new list. It is right-associative.

Prelude> 1 : (2 : (3 : (4 : (5 : []))))
[1,2,3,4,5]
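
Because the : operator is right-associative, the parentheses above may also be dropped without changing the meaning:

Prelude> 1 : 2 : 3 : 4 : 5 : []
[1,2,3,4,5]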

Thus it is possible to write functions on lists, for example one that checks whether a given list is empty:

isEmpty xs = xs == []

which we can use like this:

Prelude> isEmpty []
True
Prelude> isEmpty [1..5]
False

However, as might be expected, it is more common to approach this in a different way, called pattern matching. Pattern matching is a method for syntactically comparing values with a so-called pattern. That way, similarly to the use of guard conditions, a series of alternative function bodies can be listed, each of them assigned a pattern. At run time, those patterns are checked against the actual argument in the order of definition, the body of the first matching one is selected for evaluation, and the rest is left completely unevaluated.

For example, the previous function for checking whether a list is empty or not is usually phrased like this:

null [] = True
null _  = False

Here the first alternative matches only the empty list and returns the True constant, while for any other list a fallthrough case is defined by a "wildcard" pattern, which matches any value. Note that when a wildcard pattern is used, we cannot access the value of the argument, so it is mostly used when an answer has to be returned regardless of what was passed to the function. Thus another name for the wildcard pattern is "anonymous variable".

There is an inverse operation for list construction — it removes the topmost : symbol and thereby splits the list into two parts, called the head and the tail of the list, respectively. Those operations are defined with pattern matching:

head (x:_) = x
head _     = error "head: empty list"

tail (_:xs) = xs
tail _      = error "tail: empty list"

Here we can see that pattern matching is a bit more than just checking for equality: a pattern may contain "holes" that are covered by variable names. When the match succeeds, those holes are filled with the corresponding parts of the actual value, which we can then separately address and use in the function body.

In the definition of head, a list will only match the first line if it was built with the : symbol, otherwise (in the case of the [] symbol, that is, the empty list) it will give an error message. That is because there is no head (i.e. first element) of the empty list, and that is why head is a partial function.

We can also see that a function for extracting the tail of the list can be defined along the same lines.

Prelude> head []
*** Exception: head: empty list
Prelude> head [1..5]
1
Prelude> tail [1..5]
[2,3,4,5]

Exercise. Define the isSingleton function that determines whether a list has exactly one element. It shall work for lists of any size, like this:

Prelude> isSingleton []
False
Prelude> isSingleton [1]
True
Prelude> isSingleton [1..5]
False

Enter the Recursion

When working with lists, it is not uncommon that the whole list has to be walked. For example, consider checking the equivalence of two lists. Since we do not know in advance how long each of the lists will be, some kind of iteration has to be implemented.

In functional languages, the concept of recursion is used to express that. A function is called recursive when it refers to itself in its own definition. (Hence it is often simply said: "Recursion: see recursion.") That is why the imperative way of incrementing a number fails to work here:

x = x + 1

This gives us an infinite recursion, which is going to keep the interpreter occupied forever. Note that this loop of infinite calculations can be broken by the key combination of Ctrl+C.

Prelude> x
^CInterrupted.

Due to the lazy evaluation strategy that is employed in Haskell, such recursive functions may work quite well sometimes, though. Consider the following definition:

repeat x = x : (repeat x)

Prelude> repeat 1
[1,1,1,1,1,1,1,1,1,1,1^CInterrupted.

Although this computation will not stop either, we can at least see the intermediate results, the endless series of 1s flooding the console. Curiously, thanks to laziness, i.e. the on-demand evaluation of subexpressions, the following expression will compute in finite time:

Prelude> head (repeat 1)
1

It looks quite logical: why unfold the complete list when we can answer the question by taking only the first element? That is why in Haskell one can work with infinite data structures, which sometimes makes the description of algorithms surprisingly elegant and concise.

But a finite version of repeat can be obtained if we enforce a limit on how many elements should be generated in total. This is achieved by branching on the value of a counter and carrying on with the computation only if it is still a positive number. That way we get a base case and an inductive case. The evaluation of the base case does not involve any self-reference, so it can be answered immediately, while the inductive case does the recursion.

The finite version of repeat is called replicate and it can be recursively defined like this:

replicate n x
  | n <= 0    = []
  | otherwise = x : (replicate (n - 1) x)

Here replicate is always called with a number less than the previous one, otherwise this recursion would not terminate either. That is, when writing inductive cases, it is important to make a step towards one of the base cases by changing at least one of the arguments on the recursive invocation. For replicate, this is done by subtracting one from the counter and checking whether it reached (or went below) zero so the iteration may stop.

Prelude> replicate 0 1
[]
Prelude> replicate 3 1
[1,1,1]
Prelude> replicate (-3) 1
[]

Exercise. Define the fromTo function that takes two integers and constructs a list starting with the first parameter as the lower bound, adding successively increasing numbers until the second parameter, the upper bound, is reached. It shall work like this:

Prelude> fromTo 1 5
[1,2,3,4,5]
Prelude> fromTo 1 1
[1]
Prelude> fromTo 5 1
[]

More on Recursion

Not only numbers, but other values might also be "decremented" and therefore consumed in a recursive fashion. Lists are an example of that, where passing on only the tail of the list is a way to reach a common base case, the empty list, which can be regarded as the zero element of lists.
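
As a small sketch of this idea (the name elem' is made up here, mirroring the standard elem function), consider a function that checks whether a value occurs in a list: the empty list is the base case, and the inductive case passes on only the tail:

elem' _ []     = False
elem' e (x:xs) = e == x || elem' e xs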

Consider the definition of the take function that takes a given number of elements from the beginning of the list, if there are that many. Otherwise it should just return the full list.

take _ []         = []
take n _ | n <= 0 = []
take n (x:xs)     = x : take (n - 1) xs

There we can notice that a combination of pattern matching and guards is applied to express two classes of base cases. The first line says that if the input list is empty then, regardless of the number, we simply return the empty list. The second line says that if the number of values to take is not positive, we again return the empty list. Finally, the third line says that in any other case (actually, when all the necessary conditions are satisfied so the execution reached this alternative) we take the first element of the input list and continue building the result with an invocation of take where both the number and the list are shrunk (in the hope that they will eventually hit one of the previous cases).

Note that take is also a lazy function so it can work with infinite lists, such as the ones that are produced by repeat:

Prelude> take 3 (repeat 1)
[1,1,1]

To understand how this is possible, let us briefly take a deeper look into the actual evaluation of the expression above.

take 3 (repeat 1)
-----------------
  < matches the 3rd line of take where n = 3, x = 1, xs = repeat 1 >
1 : take (3 - 1) (repeat 1)
1 : take 2 (repeat 1)
    --------------
      < matches the 3rd line of take where n = 2, x = 1, xs = repeat 1 >
1 : 1 : take (2 - 1) (repeat 1)
1 : 1 : take 1 (repeat 1)
        --------------
          < matches the 3rd line of take where n = 1, x = 1, xs = repeat 1 >
1 : 1 : 1 : take (1 - 1) (repeat 1)
1 : 1 : 1 : take 0 (repeat 1)
            --------------
              < matches the 2nd line of take where n = 0 >
1 : 1 : 1 : []
[1,1,1]

In addition, we can also spot that replicate can be written as the composition of take and repeat, which plays nicely with modular programming:

replicate n x = take n (repeat x)

Exercise. Define the append function that appends two lists together by putting all the elements of the second list after the elements of the first. It shall work like this:

Prelude> [] `append` [1..5]
[1,2,3,4,5]
Prelude> [] `append` []
[]
Prelude> [1..5] `append` []
[1,2,3,4,5]
Prelude> [5,4,3] `append` [2,1,0]
[5,4,3,2,1,0]

In this exercise, we used the ` (backtick) symbol. That is a syntactical aid for pushing a binary alphabetical function name between its arguments. We have seen earlier that Haskell supports symbolic function names, which are often called operators. Operators are infix by default, while the names of other functions are written in prefix position, before their arguments. Quoting with this symbol helps to use prefix functions as if they were infix ones, for better readability.

Similarly, infix operators can also be turned into prefix ones by putting them into parentheses. For example, it is possible to write the + operator in the following format:

Prelude> (+) 1 2
3
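
The other direction works too: an ordinary prefix function, such as div (integer division from the Prelude), can be pushed between its arguments with backticks:

Prelude> div 7 2
3
Prelude> 7 `div` 2
3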

The Essence of Recursion Over Lists: Folding

It is worth discussing a bit the fact that induction on lists follows, and thus can be captured with, a more generic scheme, called "folding". This is about taking the elements of a list and gradually adding them up into some accumulated value that will eventually give us the result. Notice that one may fold the list from the left to the right:

( ... ((((z `f` x0) `f` x1) `f` x2) `f` x3) `f` ... ) `f` xn

or right to the left:

x0 `f` (x1 `f` (x2 `f` (x3 `f` ( ... (xn `f` z) ... ))))

where f is the operator, x0, x1, x2, and so on are the consecutive elements of the list to be folded, and z is the initial accumulated value.

For example, consider the addition of elements in a list, where the operator is the addition operator (+), the accumulated value is the sum of the elements, and the initial value for that is zero. Based on what we have just learned about recursion over lists, it is actually easy to implement such a function:

sum []     = 0
sum (x:xs) = x + sum xs

Using the method above, we can see that the invocation of that function will basically turn the list argument into an expression like this:

sum [1,2,3] --> ... --> 1 + (2 + (3 + 0))

Thanks to the abstraction power hidden in functional programming, the previously mentioned scheme that describes this class of functions can also be expressed in Haskell with the help of higher-order functions. A higher-order function is a function that takes a function as a parameter or returns a function as its value. That is one of the perks of the functional approach, where functions are first-class citizens, that is, they can act like values. For now, we will exploit only the first case, namely, when one of the parameters is a function, but later we will explore this idea further.

To begin with, let us take a simple but nice example of a higher-order function, map. This function takes a function and a list, and returns a list where the function has been applied to every element.

Prelude> map negate [1,-1,2,-2,3,-3]
[-1,1,-2,2,-3,3]

Note that negate is apparently used here without any arguments. But that is exactly what map expects, as its definition shows:

map _ []     = []
map f (x:xs) = f x : map f xs

The function that was passed gets its argument once it comes to the evaluation. It looks like we can just use the variable as if it were a function — well, that is it!

Exercise. Define the filterList function that takes a Boolean-valued function (a predicate) and a list, and keeps only the elements for which the predicate is satisfied, i.e. it evaluates to True. It shall work like this:

Prelude> filterList even [1..5]
[2,4]
Prelude> filterList even []
[]
Prelude> filterList even [1,3,5,7]
[]

In these examples, even is a predicate function that determines if its argument is an even number or not.

Prelude> even 0
True
Prelude> even 1
False
Prelude> even 2
True

Taming Functions of Functions

Now that we have seen how recursive and higher-order functions can be made, let us take a look at the "folding" scheme that was mentioned earlier. For the sake of simplicity, only the "fold right" version, the foldr function, is going to be considered, but the "fold left" version, i.e. the foldl function, can be given similarly:

foldr _ z []     = z
foldr f z (x:xs) = x `f` (foldr f z xs)

The algorithm behind the function goes like this: when the list is empty, return the initial value for the accumulated value, otherwise take the first element (the head) of the list and combine it with the value computed from the rest (the tail) of the list recursively by the (right-associative) function we passed.

With the help of this scheme, now we can come up with another definition for the sum function, which takes only a single line:

sum xs = foldr (+) 0 xs

There we can see that the + function and the zero initial value are passed to foldr. It is easy to verify that if we replace the parameters with those arguments in the definition of foldr, we get back the earlier recursive definition of sum. That is why higher-order functions are so powerful: every function in the class they represent can be expressed simply by supplying the appropriate function arguments.
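
For example, unfolding a small call by hand (in the style of the earlier take trace) gives back exactly the shape produced by the recursive sum:

sum [1,2,3]
foldr (+) 0 [1,2,3]
1 + (foldr (+) 0 [2,3])
1 + (2 + (foldr (+) 0 [3]))
1 + (2 + (3 + (foldr (+) 0 [])))
1 + (2 + (3 + 0))
6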

Note that because the + symbol, i.e. the function itself, was again used without its arguments, parentheses were employed to achieve that. Of course, any function may be passed to foldr, even custom ones. The only true requirement for them is to have the proper type.

Speaking of types: so far we have carefully avoided talking too much about them, although they are another important but often invisible component of Haskell programs. Even though we have not yet mentioned the types of our functions, they are all typed. In GHCi, the type of an expression can be queried with the :type or :t command. Let us observe the type of foldr:

Prelude> :type foldr
foldr :: (a -> b -> b) -> b -> [a] -> b

At first sight this may all be quite daunting. The first step in deciphering the answer is to note that the type information is attached to the name or symbol in question by the :: (double colon) symbol. Function types include the -> (type arrow) symbol — which is actually an operator, but in the space of types — whose purpose is to construct the type of a function from the type of the function's domain, A, and its range, B:

f :: A -> B

For multi-parameter functions, one might have borrowed the classical mathematical notation of Cartesian products, the x symbol, for types.

f :: A x B -> C

Such a type operator indeed exists in Haskell as well: the , (comma) symbol, which forms product types:

f :: (A, B) -> C

But for technical reasons that we will see later, the designers of the language instead went for using the type arrow everywhere:

f :: A -> B -> C

As a result, we can roughly interpret a function type as follows: the first (n - 1) components correspond to the types of the arguments, and the last, n-th component is the type of the returned value. According to this explanation, we can figure out that foldr has three arguments.

Besides the function types, another type operator can also be noticed, the [] or list symbol. It is yet another function over types, but it has a single argument, the type of the elements in the list. So, similarly to ->, it constructs a new type: given an element type, it builds the type of lists of that element type.

Armed with all this knowledge from the preceding paragraphs, we can now learn more about the actual parameters of foldr: the first one must be a function, the second must be a value, and the last one must be a list. But the final piece of the puzzle is still missing: why do we not have some concrete integer types, e.g. Int, everywhere, and what are those letters? The letters in the types are type variables, and they introduce the notion of parametric polymorphism. That is, our foldr function does not work for a single type of values but automatically becomes capable of handling values of almost any type. Only the exact relationships between the types appearing in the type of foldr are described by the variables. Thus, if we substitute those variables with actual types, we can obtain (infinitely) many versions of this abstract type.

For example, consider a = Integer and b = Bool, where Integer is the type of integer numbers and Bool is type of Booleans:

foldr :: (Integer -> Bool -> Bool) -> Bool -> [Integer] -> Bool

It is even allowed to assign the same type to both variables, for example we could have chosen both a and b to be the Integer type.

foldr :: (Integer -> Integer -> Integer) -> Integer -> [Integer] -> Integer

However, an important rule of the substitution of type variables is that we must always replace the same type variable with the same type everywhere it occurs. Hence the following version is invalid, foldr cannot be used like this:

foldr :: (Bool -> Bool -> Bool) -> Integer -> [Integer] -> Integer

The way a type got assigned to the functions we defined is called type inference. This is an algorithm that attempts to calculate the most generic, or as it is often called, principal type for each of the functions. For foldr, the presented type is the most generic one: nothing in its definition constrains the element type or the type of the accumulated value any further than the way the passed function combines them.

If the compiler fails to infer a type in this manner, it raises a type error and refuses to generate code for the given definition. That is why this is also called type checking: type inference can not only free us from writing type declarations, it can also verify whether the given definition makes sense at all. Therefore it is a form of semantical evaluation of the program.

Consider a totally wrong version of foldr:

foldr _ z []     = z
foldr f z (x:xs) = x `f` (foldr z f xs)

If we try to load this code into the interpreter, it will emit a nice type error message to tell us that something is not quite right:

Prelude> :load Fail.hs
[1 of 1] Compiling Main             ( Fail.hs, interpreted )

Fail.hs:4:33:
    Occurs check: cannot construct the infinite type:
      t1 ~ t -> t1 -> t1
    Relevant bindings include
      xs :: [t] (bound at Fail.hs:4:14)
      x :: t (bound at Fail.hs:4:12)
      z :: t1 (bound at Fail.hs:4:9)
      f :: t -> t1 -> t1 (bound at Fail.hs:4:7)
      foldr :: (t -> t1 -> t1) -> t1 -> [t] -> t1 (bound at Fail.hs:3:1)
    In the first argument of ‘foldr’, namely ‘z’
    In the second argument of ‘f’, namely ‘(foldr z f xs)’
Failed, modules loaded: none.

We can also note that many other variations of foldr can be typed, but they do not conform to the previous specification. So, even if type declarations are not always mandatory, it is considered good engineering practice to explicitly state the type of the major or top-level functions. This can help us detect when a function's type suddenly changes, like in the example below, where we just forgot to add the backticks around f, which completely changed the meaning of the definition:

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr _ z []     = z
foldr f z (x:xs) = x f (foldr f z xs)

Thanks to explicit type declaration, the compiler can come to the rescue:

Prelude> :load Fail.hs
[1 of 1] Compiling Main             ( Fail.hs, interpreted )

Fail.hs:5:20:
    Couldn't match expected type ‘(a -> b -> b) -> b -> b’
                with actual type ‘a’
      ‘a’ is a rigid type variable bound by
          the type signature for foldr :: (a -> b -> b) -> b -> [a] -> b
          at Fail.hs:3:10
    Relevant bindings include
      xs :: [a] (bound at Fail.hs:5:14)
      x :: a (bound at Fail.hs:5:12)
      z :: b (bound at Fail.hs:5:9)
      f :: a -> b -> b (bound at Fail.hs:5:7)
      foldr :: (a -> b -> b) -> b -> [a] -> b (bound at Fail.hs:4:1)
    The function ‘x’ is applied to two arguments,
    but its type ‘a’ has none
    In the expression: x f (foldr f z xs)
    In an equation for ‘foldr’: foldr f z (x : xs) = x f (foldr f z xs)
Failed, modules loaded: none.

Exercise. Define the andList function (in a single line) with foldr: it takes a list of Booleans and performs the logical conjunction, the && operator, between the elements. It shall work like this:

Prelude> andList []
True
Prelude> andList [True,False,True,True,False,True,False]
False
Prelude> andList [True,True,True]
True

As a hint, here is the recursive definition for it:

andList :: [Bool] -> Bool
andList []     = True
andList (x:xs) = x && andList xs

Partially Applied Functions

If we recall the other higher-order function, map, it also has a type that we can query:

Prelude> :t map
map :: (a -> b) -> [a] -> [b]

Based on its definition, its type can be deduced by the same argument we used for foldr. And it can indeed work with the even predicate that we introduced for the filterList exercise (so a = Integer and b = Bool):

Prelude> map even [1..5]
[False,True,False,True,False]

However, it is good to know that it is not exclusively one-parameter functions that may be passed to it. In Haskell, because we abandoned the mathematical intuition of using Cartesian products, a curious opportunity pops up: it becomes possible to partially apply functions. To demonstrate this, consider a function that we will call const:

const :: a -> b -> a
const x _ = x

This function takes two parameters and cancels out the second one.

Prelude> const 1 2
1
Prelude> const True False
True

As an ordinary prefix function, it is invoked with its arguments listed after its name. But it is time to add that function application actually takes a single argument at a time. This can be shown if we add the parentheses that could have been omitted so far, due to the left-associative nature of the function application operator, the space:

Prelude> ((const True) False)
True

We can also check the type of the const True expression:

Prelude> :t const True
const True :: b -> Bool

It looks like we have just created another function by simply omitting one of the arguments. Now this explains better why it was decided to represent functions that way. It also has the consequence that the -> type operator is right-associative, so the type of const can be written as follows:

const :: a -> (b -> a)

This feature comes in handy when using higher-order functions. Consider the use of map where the apparently two-parameter const is fitted in by binding its first parameter:

Prelude> map (const True) [1..5]
[True,True,True,True,True]

None of the numbers appear in the result; instead, True is repeated exactly five times, which corresponds to the size of the list passed.

Exercise. Define the mapList function in terms of foldr. In order to express the functionality of map, try to construct a list as the result, and define a helper function of type (a -> b) -> a -> [b] -> [b] that takes the function parameter and folds with it. It shall work as the regular map function.

Types are Everywhere

As it might have been expected, all of the other featured functions have types as well, for example negate from earlier:

negate x = 0 - x

And its type:

Prelude> :t negate
negate :: Num a => a -> a

It almost fully matches the mental image of types that we have built up to this point: the function takes a single parameter and produces a value of the same type. The only difference is the part that says Num a =>. This is called a type context, or context for short, and we can quickly discover that it comes from the use of the - operator:

Prelude> :t (-)
(-) :: Num a => a -> a -> a

From our previous findings we already know that the presence of type variables indicates polymorphism, so it could be claimed that - may be applied to values of many different types. Type variables may be replaced with any Haskell type, but when a context is associated with them, it becomes possible to restrict the values of the variable to types that are members of some class of types.

Num is just such a class, and it means the "class of types that represent numbers". In Haskell, such types are, for example, Integer and Double, so - works for both of them, but their respective versions are used behind the curtains. This can be thought of as a form of overloading function names, so the same name can be reused for multiple types, or, the other way around, a function may behave differently in different type settings. That is called ad-hoc polymorphism.

Prelude> -- (-) :: Double -> Double -> Double
Prelude> 0.0 - 1.0
-1.0
Prelude> -- (-) :: Integer -> Integer -> Integer
Prelude> 0 - 1
-1

But Num does not contain only the - operator, it comes with all the other common operations on numbers. These can be listed with the :info command, or :i for short.

Prelude> :info Num
class Num a where
  (+) :: a -> a -> a
  (-) :: a -> a -> a
  (*) :: a -> a -> a
  negate :: a -> a
  abs :: a -> a
  signum :: a -> a
  fromInteger :: Integer -> a
        -- Defined in ‘GHC.Num’
instance Num Word -- Defined in ‘GHC.Num’
instance Num Integer -- Defined in ‘GHC.Num’
instance Num Int -- Defined in ‘GHC.Num’
instance Num Float -- Defined in ‘GHC.Float’
instance Num Double -- Defined in ‘GHC.Float’

The class keyword introduces a class, followed by its name, which is Num this time, and a type variable. That is the variable we can use in describing a set of functions, all of which should be present for a given type in order for it to be classified as a member. Membership declarations can be added any time and anywhere by instance definitions, where the type-specific implementations of the overloaded names are assigned.

Mind that it could be misleading to think about type classes as classes in the object-oriented sense. Although they bear some similarity, type classes are better viewed as abstract interfaces that pack together a group of functions, called the members of the class, to allow the reuse of names with multiple implementations.
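
As a small sketch of the shape of such declarations (the Describable class and its instances below are made up for illustration only, they are not part of the standard library):

class Describable a where
  describe :: a -> String

instance Describable Bool where
  describe True  = "yes"
  describe False = "no"

instance Describable Integer where
  describe n = "the number " ++ show n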

Curiously, not only functions but also values (which are actually constant functions with no parameters) may be ad-hoc polymorphic. Consider the type of the 1 constant:

Prelude> :t 1
1 :: Num a => a

Indeed, why stick a specific type to the 1 symbol other than saying that it is a number? It may be either an integer or a floating-point number, there is no information about its actual representation. The representation is (lazily) narrowed down as we start to use it. Consider the following expression:

Prelude> :t 1 / 3
1 / 3 :: Fractional a => a

Another type context appears which says that 1 / 3 is not just any number but a fractional number, excluding the possibility of it being an integer due to the use of /, the fractional division. Fractional is a more restrictive subclass of Num:

Prelude> :i Fractional
class Num a => Fractional a where
  (/) :: a -> a -> a
  recip :: a -> a
  fromRational :: Rational -> a
        -- Defined in ‘GHC.Real’
instance Fractional Float -- Defined in ‘GHC.Float’
instance Fractional Double -- Defined in ‘GHC.Float’

And we can see that the interpreter eventually chooses a member type when it is forced to calculate the value of this expression:

Prelude> 1 / 3
0.3333333333333333

This is due to the built-in defaulting mechanism of GHCi, and the type it chooses is Double. Using the :: symbol for attaching types, we can control the narrowing down, that is, how to make a polymorphic type monomorphic.

Prelude> 1 / 3 :: Float
0.33333334

However, this should not be confused with casting: Haskell is a strongly typed language, thus it does not allow changing the type of expressions without providing an appropriate conversion function. Without this principle, the type system would not be able to convey correctness guarantees.

Prelude> (1 / 3 :: Float) :: Double

<interactive>:2:2:
    Couldn't match expected type ‘Double’ with actual type ‘Float’
    In the expression: (1 / 3 :: Float) :: Double
    In an equation for ‘it’: it = (1 / 3 :: Float) :: Double

Fortunately, many conversion functions exist for numbers; one that we can use here is realToFrac:

Prelude> :t realToFrac
realToFrac :: (Fractional b, Real a) => a -> b
Prelude> realToFrac (1 / 3 :: Float) :: Double
0.3333333432674408

But it promptly becomes evident that this is just a quick fix for the problem. Because we first performed the calculation with a lower precision, we irremediably lost information — that is what the type system tries to make us mindful of.

Embracing Lambdas

We have claimed that functions are first-class citizens in functional languages, which means that they can act as values. Being a value implies that it does not have to be named in order to be used. In the previous examples, we used the values 1 and 3 without attaching dedicated identifiers to them. That holds for functions as well, as they can be written with the so-called lambda notation. Using the lambda symbol alone is enough to define a function, we are not required to name it.

As an illustration, let us rewrite the negate function with this notation:

negate = \x -> 0 - x

There we can see that the lambda is actually written as \ (backslash) and it replaces the name of the function. The lambda symbol is a quantifier that binds a variable in a scope. This quantifier is employed to tell which variables, and where in the body, should be substituted on the application of the function. The bound variables and their scope are delimited by the -> symbol, which is not the same as the previously introduced type arrow operator, however it is again used to represent a function.

Let us drop the name of the function and use the right-hand side of this new definition of negate directly in the interpreter.

Prelude> :t \x -> 0 - x
\x -> 0 - x :: Num a => a -> a
Prelude> (\x -> 0 - x) 4
-4

Multi-parameter functions can be expressed by nested lambdas, and it is now more visible that the lambda operator, or lambda abstraction, is right-associative, which confirms our previous suspicion:

Prelude> :t (\x -> (\y -> x))
\x -> \y -> x :: a -> b -> a
Prelude> (\x -> (\y -> x)) 1 2
1

Note that nested lambda functions may be simplified and written with fewer arrows:

Prelude> :t \x y -> x
\x y -> x :: a -> b -> a

Lambda abstractions can help us reduce the need for introducing a fresh helper function every time we want to use a higher-order function. Consider the definition of the length function that measures the length of a list: we have inlined the definition of the helper function, the function parameter of foldr, at the place of application and kept everything in a single line.

length :: Num b => [a] -> b
length xs = foldr (\_ n -> n + 1) 0 xs

Exercise. Unroll the definition of length using recursion directly with the name listLength. It shall work identically to the length function.

Constructing Functions on the Spot

Furthermore, the lambda notation offers a way to construct functions inside functions, that is, to return a function as the result. For example, a function like the following can be defined without much ado:

actuator :: Integer -> (Integer -> Integer)
actuator n
  | n <  0 = \x -> x - 1
  | n == 0 = \x -> x
  | n >  0 = \x -> x + 1

That is, the actuator function returns a different function depending on its argument:

Prelude> (actuator 0) 5
5
Prelude> (actuator 1) 3
4

Another nice example of that is the uncurry function. Not long ago, we learned that in functional programming, multi-parameter functions are represented as a series of nested one-parameter functions. This is named "currying" after Haskell Curry, an American logician from the last century, who laid down many of the foundations of typed functional programming languages and proposed this technique. Hence the name "uncurry" refers to the conversion from the curried format to the traditional one, and its inverse is called curry.

A -> B -> C     --- uncurry -->     (A, B) -> C
                <--- curry ----

So, here is the definition:

uncurry :: (a -> b -> c) -> ((a, b) -> c)
uncurry f = \(x,y) -> f x y

and it can be used like this:

Prelude> :t uncurry replicate
uncurry replicate :: (Num a, Ord a) => (a, b) -> [b]
Prelude> (uncurry replicate) (3, True)
[True,True,True]

However, a careful reader may notice that — due to the right-associativity of the type arrow operator — the uncurry function could actually be written as a two-parameter one:

uncurry :: (a -> b -> c) -> (a, b) -> c
uncurry f (x,y) = f x y

Exercise. Define the curry function and show how to use it.

Further Perks of the Lambda Notation

While we are at lambda functions, we should add that there is a whole logical calculus, the lambda calculus, associated with them, created by Alonzo Church in the 1930s. It forms the mathematical foundation, the computational model, the semantics of Haskell. Thus, one might say that lambda functions constitute the machine code of functional programs. As a result, many of their features, such as the previously presented currying, are inherited from there. Another curious operation on functions is called eta-conversion, and it goes as follows.

\x -> E x   <--- eta --->   E    if x does not occur free in E

The rule states that a function whose body is an application of some expression E to the bound variable x equals E itself, provided that x does not occur free in E (that is, every occurrence of x in E is bound by some inner lambda, or x does not occur in E at all). Or, in the other direction, a function body may always be wrapped with a fresh quantified variable on the left and an application to that variable on the right, without changing its meaning.

For example, the following simplification is allowed:

\x -> (\x -> x) x   -->   \x -> x

while the following is not:

\x -> (\y -> x) x   -/->  \y -> x

Eta-conversion can frequently be exploited to give shorter function definitions; consider the following version of the foldr-based sum function:

sum' :: Num a => [a] -> a
sum' = foldr (+) 0

Composing Functions

This reinforces the fact that we can directly construct functions from other functions. Functions can be put together with the function composition operator, which can be expressed in Haskell with our current knowledge with no fuss:

infixr 9 .
(.) :: (b -> c) -> (a -> b) -> (a -> c)
f . g = \x -> f (g x)

New compared to the previous definitions is the first line here, which sets the associativity and precedence of the function, because it is infix. As was said earlier, symbolic operator names, like . in this case, or alphabetical names quoted with the backtick symbol automatically become infix, so they may appear after the infix family of keywords. The infixr keyword is for right-associative functions, infixl is for left-associative ones (which is the default), and finally infix is for functions that are not associative. The number between the keyword and the name denotes the precedence: it goes from zero to nine, where zero is the weakest and nine is the strongest (which is the default).
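
Two consequences of that declaration, shown as a short sketch (using the (.) defined above):

-- infixr 9 makes a chain of compositions associate to the right:
--   f . g . h       means   f . (g . h)
-- and, because ordinary function application binds even tighter than any
-- operator, the expression below applies g to x first:
--   f . g x         means   f . (g x)      -- and not (f . g) x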

The function composition operator closely follows the mathematical definition and it can be used like this:

Prelude> map ((\x -> x + 1) . (\x -> x * 2)) [1..5]
[3,5,7,9,11]

Essentially, it is the same as if we wrote this:

Prelude> map (\x -> 2 * x + 1) [1..5]
[3,5,7,9,11]

As we can see, the lambda notation gets in our way in such situations, which is why the designers of Haskell decided to add a syntactical rule that allows omitting the variables in a way similar to the one we have seen for eta-conversion. These are called sections, and their application can often make expressions more comprehensible:

Prelude> map ((+ 1) . (* 2)) [1..5]
[3,5,7,9,11]

This possibility gives space for more experimentation too, and we can have many different modes of composing functions, for example consider this one:

infixr 3 &&&
(&&&) :: (a -> b1) -> (a -> b2) -> (a -> (b1, b2))
f &&& g = \x -> (f x, g x)

It passes the same value to different functions and creates a pair out of their results.
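
Assuming the definition above has been loaded, it can be used like this:

Prelude> ((+ 1) &&& (* 2)) 3
(4,6)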

Exercise. Define the >>> operator, aka. the forward composition. It shall work like this:

Prelude> ((+1) >>> (*2)) 3    --  (3 + 1) * 2
8
Prelude> ((*2) >>> (+1)) 3    --  (3 * 2) + 1
7

Hiding the Function Parameter

The composition of higher-order functions like . and &&& opens up a rather unique way of writing programs. Consider the definition of the average function below, which determines the average of the elements in a list:

average :: Fractional a => [a] -> a
average = uncurry (/) . (sum &&& length)

The function first takes the sum and the length of the list (at the same time) and then divides the two. The speciality of this style is that it never names the parameter, which is why it is called "point-free" or "tacit" programming. The name of this technique comes from geometry, and the "point" here refers to the parameter.
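
Assuming the definitions of average, sum, length, and &&& above have been loaded, it works like this:

Prelude> average [1,2,3,4]
2.5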

Exercise. Define the squareRoot function by implementing the Newton iteration with the following functions: \x -> (x + a / x) / 2, iterate, dropWhile, and head. Use the lambda function to calculate the next approximation of the square root of a, which is now 2:

Prelude> (\x -> (x + 2 / x) / 2) 2
1.5
Prelude> (\x -> (x + 2 / x) / 2) 1.5
1.4166666666666665

The iterate function helps with running the iteration without end, so apply the dropWhile function to drop the approximations that are not yet close enough to the expected result, that is, whose square is not yet almost the same as the number whose square root we are looking for, and take the first appropriate one with head. A limit for the approximation can be set with a sufficiently small epsilon value.

More information on the referenced functions can be found in Hoogle, the API search engine of Haskell.

It shall work like this with the epsilon set to 0.001:

Prelude> squareRoot 2
1.4142156862745097
Prelude> squareRoot 4
2.0000000929222947

Lists as Sets

If we take a step further, another function can be defined that determines the average deviation of the elements. That is, it calculates the average of the elements (the mean), then the absolute difference from it for each element, and finally takes the average of those differences.

averageDeviation :: (Ord a, Fractional a) => [a] -> a
averageDeviation xs = average [ abs (x - mean) | x <- xs ]
  where mean = average xs
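
For example, assuming the definitions above have been loaded:

Prelude> averageDeviation [1,2,3,4,5]
1.2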

We could have used the classical recursion technique for lists, but it is useful to know that the language offers us some syntactical aid for that, known as "list comprehensions". It borrows the mathematical intuition of Zermelo-Fraenkel (ZF) set expressions, where a set is described by a characteristic function, for example:

{ x² | x ∈ ℕ, x is even }

This describes the set of squares of the even natural numbers. In Haskell, it can be written in almost the very same fashion. Here [0..] is a list that corresponds to the set of natural numbers, because it is an infinite list (a list without an upper bound) that contains every integer starting from zero, at least in theory.

[ x^2 | x <- [0..], even x ]

However, the Haskell version is actually a list that gets computed: elements may appear multiple times in the result, and they have a clearly defined order. That is the reason why some ZF-style set expressions will not work as expected, for example:

[ (x,y) | x <- [0..], y <- [0..] ]

Theoretically, this should give the Cartesian product of the natural numbers with themselves. But in Haskell, the <- (left arrow) is interpreted as a loop over the elements of the list on its right-hand side, and using that operator multiple times constructs a nested loop over the lists, so here the inner loop never finishes and x never advances past 0. With finite lists, the behaviour is shown nicely:

Prelude> [ (x,y) | x <- [1..3], y <- [1..3] ]
[(1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3)]

Note that the use of list comprehensions with a single list is nothing more than a map:

map f xs = [ f x | x <- xs ]

and its other component is called a filter:

filter p xs = [ x | x <- xs, p x ]

for example:

Prelude> [ x | x <- [1..5], even x ]
[2,4]

Local Definitions

What we can also see there is the where keyword. With its help, a local variable or function can be added. Local definitions always see the names and other local definitions of the host definition (as those are global to them). This way local functions can share information with their host, which allows for tidier definitions, for example, for foldr:

foldr f z xs = go xs
  where
    go []     = z
    go (x:xs) = x `f` (go xs)

The where keyword can also be combined with pattern matching, which makes it possible to write functions like partition that return multiple results as an n-tuple. The partition function takes a predicate and splits a list into two based on it:

partition :: (a -> Bool) -> [a] -> ([a], [a])
partition p [] = ([], [])
partition p (x:xs)
  | p x        = (x:xs1,xs2)
  | otherwise  = (xs1,x:xs2)
  where
    (xs1,xs2) = partition p xs

Prelude> partition even [1..5]
([2,4],[1,3,5])

Finally, where is suitable for telling the compiler that we would like to cache the result, so it should not be computed more than once. Because our expressions are pure, this is feasible. In the definition of averageDeviation, the mean variable is treated exactly like that.

One final remark: note that in the type of averageDeviation there are multiple contexts present. They are inherited from the functions used in the implementation: the Fractional restriction is the result of using the average function, while Ord is the consequence of our abs (which compares its argument with zero). That is, we can conclude that restrictions may implicitly add up through function composition.

Interaction with the World Outside

Even after all of these adventures, a thrilling question may still remain: how do pure functional programs interact with the outside world, how do they do I/O? A pure approach to this problem would be to work with a start expression that receives the environment of the program and passes it down to any function that needs it. That is the path taken in the programming language Clean, but in Haskell, this problem gave rise to the inclusion of computational contexts, monads.

We will not go into the details for now, but the key to the solution is a type called IO that marks that a given function has a side effect. Consider the following pure function, a constant:

Prelude> :t "foobar"
"foobar" :: String
Prelude> "foobar"
"foobar"
Prelude> "foobar"
"foobar"

Every time we evaluate this constant, it will give the very same result, which is expected from a pure value. But now let us take an effectful constant:

Prelude> :t getLine
getLine :: IO String
Prelude> getLine
hello!
"hello!"
Prelude> getLine
yello!
"yello!"

This happens because getLine itself is not a function but an "action", namely an IO action. Actions are different from functions in the sense that they have to be "run" in order to get their results. GHCi has internal support for running IO actions: it automatically passes them the (invisible) program environment, for example the state of the buffer of the standard input — so the user is asked for some input first, which is then passed to the getLine action.

IO is a type similar to the list: it has a parameter that characterizes the type of the result that is returned after the effect of the action has been played. This also makes it possible for the type system to clearly separate the pure and the effectful parts of the program, as they cannot be mixed and matched: there is no function to play the effects of an I/O action inside the program. That is done by the run-time system for the whole start expression at once, which is called main in Haskell.

For example, assume that we want to wrap a program around our earlier functions that takes integers from the command line as arguments, converts them from strings to numbers, determines their average deviation, and finally prints the result to the standard output:

main :: IO ()
main = do
  args <- getArgs
  let xs = [ read x | x <- args ]
  print (averageDeviation xs)

There we can see that the type of main is IO (), which means that it is an effectful function that does not return any meaningful value. Well, it returns something, denoted by (), but that is a type that has only a single element, (). It can be considered the void type of other languages, so main is basically a void function, a procedure.

The body of main is elaborated with a special syntax, named do notation. Jokingly, its purpose is to make imperative programmers believe that imperative-style code blocks can be embedded into Haskell programs. This fits the current situation well, but in reality there is more than meets the eye: the do notation provides a way to compose effectful actions into blocks, so the values they compute at each of the steps can be passed to other actions. Mind that the indentation matters again, as elements of the same block should start at the same column. In return, we are saved from adding semicolons at the end of each line, and also from adding block markers.

To bind a name to the result of some computation in the block, the <- (bind) operator is used. This can be thought of as a variable assignment operator, like, for example, :=. So the first line of the block may be interpreted with an imperative mindset in the following way:

args := getArgs

The getArgs action is another effectful constant that returns the command-line arguments of the invoked program, and it can be found in the System.Environment module. Modules can be imported with the import keyword at the beginning of every Haskell source file, or in GHCi:

import System.Environment

This assignment automatically creates a variable named args so we do not have to bother with declaring it — its type will be automatically inferred in the way we have already seen.

Note that pure values cannot be added to such effectful blocks directly, as their type will not match. For that reason, there is a function, return, that "lifts" pure values into the context of some computation. (The name return is not a very good choice, some people say that lift or inject would have been a better one.)

Prelude> :t return
return :: Monad m => a -> m a

In this type, the Monad describes the notion of effect, and return can be used to turn any pure value into an effectful one.

main :: IO ()
main = do
  args <- return []
  print args
  return ()

This version of main works with a (purely) constant list of arguments and explicitly returns void. Note that the last line may actually be omitted, as the print action itself returns a void value, which already satisfies our so-far-hidden expectation that the last expression of the block should have the same type as the block itself.

Playing the game of binding a lifted pure value to a name looks awkward, so the next line in the original block has a new keyword, let, which binds pure values to names, in a similar vein to where. In the block, let is used to create a fresh variable, xs, that contains the numeric version of the command-line arguments. The conversion itself is done by a list comprehension that iterates over the elements of the args list and maps each of them from a string to a number. This is achieved by the polymorphic read function:

Prelude> :t read
read :: Read a => String -> a

The read function is overloaded, and it works for any type that is a member of the Read type class: it takes a string and parses it as if it were the textual representation of some value of that type. Note that, being polymorphic in its return type, read always needs type information about its return value:

Prelude> read "42"
*** Exception: Prelude.read: no parse

In GHCi, this works out badly because, due to the defaulting mechanism for overloaded functions, the return type is chosen to be (). At compile time, however, this gives a type error message, as the type variable cannot be made monomorphic and therefore becomes ambiguous.

Prelude> read "()"
()

Of course, the situation can be saved by type specialization:

Prelude> read "42" :: Integer
42

On that note, it is worthwhile to mention that the standard instances of Read are quite smart, so we can easily parse more complicated values like this one:

Prelude> read "([(1, True)], [('x', 3.14)])" :: ([(Integer,Bool)], [(Char,Double)])
([(1,True)],[('x',3.14)])

The main function can be invoked from GHCi directly with the :main command:

Prelude> :main 1 2 3 4 5 6 7 8 9 10
2.5

But it is also possible to compile the full source down to a native binary with GHC itself. GHC has support for building multi-module programs as it can automatically discover the involved modules, given that they maintain a hierarchy like Java programs do: for example, the module named A.B.C must be in the file A/B/C.hs relative to the directory of the compiler's invocation.

The -main-is flag is used to tell the compiler that our main module is not called Main (which is assumed by default) but Tutorial.

$ ghc Tutorial.hs -main-is Tutorial
[1 of 1] Compiling Tutorial         ( Tutorial.hs, Tutorial.o )
Linking Tutorial ...
$ ./Tutorial 1 2 3 4 5 6 7 8 9 10
2.5

Exercise. Add some basic error handling, such as checking for the presence of command-line arguments.

Closing

Well, that is all, folks! :-) The source of this tutorial can be downloaded and a key to the solutions is also available. Feedback and bug reports for this tutorial are more than welcome. All the contents may be used under the terms of the Creative Commons Attribution-Noncommercial 3.0 License.