What we are trying to do in the 8-Queens puzzle is find ways of placing 8 queens on an 8 by 8 chessboard so that no two queens are on the same row, the same column and the same diagonal (no two queens hold each other in check – no two queens attack each other) |

How many ways are there of placing the queens on the board? |

For the first queen, I have a choice of 8 x 8 = 64 cells where I can place it. For the next queen, I have one less choice, i.e. 63. And so on, until I get to the 8th queen, at which point I have only one choice left. |

And if we multiply those numbers, we find that there are more than 178 trillion ways of placing the queens. |

But we can arrive at the same queen configuration in different ways. If we consider a particular placement of the queens e.g. this one, how many way are there of arriving at it? |

When we pick the first queen, we have 8 choices. When we pick the second queen, we have one less choice, i.e. 7. And so on, until we pick the eighth queen, at which point we have only one choice left. |

And if we multiply those numbers, we find that there are more than (8!) 40,000 ways of arriving at the same queen configuration. |

So, if we want to work out the number of ways of placing the queens so that we arrive at a unique queen configuration, we divide the first number by the second number |

And if we do the math, we see that there are still more than 16 billion ways of arriving at a unique configuration. |

If we take into account that no two queens can be on the same row, then we can drastically reduce the number of candidate boards that we need to consider when looking for a solution. |

To do that, we start from the top row, and move downwards, placing one queen on each row. On each row, there are 8 ways of placing a queen. And if we do the maths, we find that there are 88 ways of doing that, i.e. still more than 16 million ways of doing that. |

We also know that no two queens can be on the same column. So we can again reduce the number of candidate boards we need to consider. |

To do that, we make sure that when placing a queen, we choose a column that has not already been picked. And if we do the math we see that there are (again that number 8!) more than 40,000 ways of doing that. And while we have not yet taken into account that no two queens can be on the same diagonal either, this does give us a good idea of where we are headed in terms of the candidate boards that need to be considered when looking for a solution (we have already greatly reduced the number of candidate boards by several orders of magnitude). |

Let’s try to find a solution to the puzzle. |

We place the first queen and cross out the rows, columns and diagonals which are no longer available for placing upcoming queens (because they would be held in check by the queen). |

We place the second queen, and again cross out. |

We place the third queen, and again cross out. |

We place the fourth queen, and again cross out. |

We place the fifth queen, and again cross out. |

But when we try to place the sixth queen, we are stuck. There is nowhere to place it because all the empty cells on the board are held in check by the previously placed queens. |

Let’s try again. |

We place the first queen, and cross out. |

We place the second queen, and cross out. |

We place the third queen, and again cross out. With a bit of luck, we'll find a solution. |

That’s it: we found a solution. |

The 8-Queens puzzle has always featured in the official Scala book and Scala FP course co-authored by Martin Odersky, because such combinatorial puzzles are particularly suited to solving with for expressions. |

The 8-Queens puzzle is a special case of the more general N-Queens puzzle, in which we have to place N-Queens on an N by N chessboard. Solutions exist for all positive integers N with the exception of 2 and 3. |

In the Scala book, a candidate solution is represented as a list of pairs, with each pair consisting of the row number and column number where a queen is located. The rows and columns start at 1. |

In the Scala FP course, the rows and columns start at 0. Because it is obvious that the Nth queen is in the Nth row, there is no need to be explicit about it, and so a solution is represented by a list of column numbers. |

In the Scala FP course, Odersky provides a ‘show’ function to display a candidate solution board on the console. |

Here is the code for the ‘show’ function. I have tweaked it slightly to use a cross for an empty cell and a crown for a populated cell. |

If we take the representation of a candidate solution board, e.g. this one… |

…and we pass it to the show function, and print the result… |

…we get this, which is pretty good. |

But now let’s look at how we can use a graphics library like Doodle to create an image of a candidate solution board. |

We can easily use Doodle to create a 100 pixel by 100 pixel red square image... |

...and a blue one. |

In Doodle there is a beside function, which takes two images and returns a composite image that is the first image beside the second image. |

Similarly, there is an above function, which takes two images and returns a composite image that is the first image above the second image. |

So, if we have a candidate solution board… |

…what we can do is create square images for each of the cells in the board, a white square for an empty cell, and a red square for a populated cell. |

And then we can use the beside and above functions to stitch the square images into a single composite image for the whole board. |

So, let’s create ourselves a white square image and a red square image… |

And now we can write ourselves a combine function which takes a grid of square images and returns an image that is the composition of all those images. |

The way it works is that it first maps over the rows of square images. |

For each row of images, it reduces the images with the beside function, stitching them together, one beside the other, into a single composite image for the whole row. |

Once it has done that, it takes the resulting images for the rows, and combines them with the above function, stitching them together, one above the other, into a single composite image for the whole board. |

Our combine function is using the reduce function, which is unsafe in that when reduce is invoked on an empty collection it throws an exception. |

To be on the safe side, let’s use the fold function instead of the reduce function. To do that, we must tell the fold function what to return when the collection is empty, so we tell it to return the empty image. |

Also, fold is just an alias for foldLeft, so let’s make that explicit. |

In this section of the talk, we need the notion of a monoid. For the purposes of this talk, a monoid consists of three things. The first thing is the set of elements of a given type, e.g., the positive integer numbers. The second thing is a binary associative operation on the elements of that type, e.g., integer addition. And the third thing is an identity element for the binary operation, e.g., zero. In the foldable trait, in the red book, we see a ‘concatenate’ function that operates on a foldable of A, and takes as argument a monoid for A. The ‘concatenate’ function does a left fold of the As, using the monoid’s identity as the initial accumulator, and using the monoid’s binary operation to combine all the foldable’s As into an A. |

In Cats, the ‘concatenate’ function is called ‘fold’, but there is also a ‘combineAll’ alias, to avoid clashes with the ‘fold’ function in the Scala standard library. |

Let’s skip the following slides, which are an introduction to (or a refresher on) monoids. |

Back to the code for our ‘combine’ function. It does a ‘foldLeft‘, with the empty image as initial accumulator and the ‘beside’ function as binary operation. |

What we can do is use Cats to define a monoid for images, whose identity is the empty image, and whose binary operation is the ‘beside’ function. For lack of imagination, let’s call this the ‘beside’ monoid. |

And now, instead of doing a left fold with the empty image and the ‘beside’ function, we can do a ‘combineAll’ with the ‘beside’ monoid. |

In the ‘foldable’ trait, we also see another function, called ‘foldMap’, which builds on top of the functionality provided by ‘concatenate’. It also operates on a ‘foldable’ of As, but it takes an additional parameter f, which is a function from A to B. Also, not that the monoid parameter is for B rather than for A. Logically, what ‘foldMap’ does is first map the As to Bs, and then proceed the same way that ‘concatenate’ does, i.e., by folding the elements using the monoid’s identity and binary operation. By the way, on this slide, ‘foldMap’ is implemented using ‘foldRight’, but it can just as well be implemented using ‘foldLeft’ (see next slide). |

Back to the code for our ‘combine’ function. |

Purely to make the next step easier to understand, let’s revert to using the unsafe ‘reduce’ function rather than ‘foldLeft’. |

Our ‘combine’ function is doing a map of a function, followed by a reduce with the ‘above’ function. A map followed by a reduce is exactly what ‘foldMap’ does (the function we have just seen). |

What we can do is use Cats to define another monoid for images. Its identity is still the empty image, but its binary operation is the ‘above’ function. Again, for lack of imagination, let’s call it the ‘above’ monoid. |

Now, instead of doing a map with a function, followed by a reduce with the ‘above’ function, we can do a ‘foldMap’ with the function and with the ‘above’ monoid. |

Here is a recap of the start and end points of our small refactoring journey. |

And it works, the refactored code produces exactly the same result (on this slide, I am using the ‘fold’ name rather than the ‘combineAll’ one). |

Now let’s go back to the show function… |

…and where it uses crosses and crowns, let’s use white square images, and red square images… |

…and where it stitches together smaller strings into larger strings, let’s put the square images into a grid, and pass them to our ‘combine’ function, so that instead of getting a string representation of a candidate solution board, we get a composite image for the board. |

We can do a bit more though. |

The images that we have been creating up to now have been centred on the origin of the coordinate system. |

In Doodle, if call an image’s ‘at’ function with the coordinates of a desired location, we get back the same image but centred on that location. And if we are prepared to do that, then we can further simplify how we combine images. |

Back to our combine function. |

We can now use Cats to define another image monoid, one whose identity is still the empty image, but whose binary operation is the ‘on’ function, which takes two images, and returns a composite image that is the first image superimposed on the second image. Again, for lack of imagination, let’s call this the ‘on’ monoid. |

Now instead of using the ‘beside’ and ‘above’ monoids, we can just use the ‘on’ monoid. |

And if we make the ‘on’ monoid implicit, i.e., if we say that the default monoid for images is the ‘on’ monoid, then we can simplify things further, because ‘foldMap’ and ‘combineAll’ are happy to take the ‘on’ monoid implicitly. |

To recap, here are the start and end points of our whole refactoring journey, and hopefully you’ll agree that the bottom function is simpler, provided we understand what ‘foldMap’ and ‘combineAll’ do, and what the ‘on’ monoid is. |

Earlier we looked at the number of ways of placing 8 queens on an 8 by 8 chessboard, one queen per row. We came across 88, which is the number of permutations of 8 queens, allowing repetition, and 8!, which is the number of permutations of 8 queens disallowing repetitions. |

So, for arbitrary N, the formulas are N to the power of N and N! |

But how do we compute the permutations of N queens, e.g. for N=4? |

We can use this simple program. The for expression first chooses a column number, between 1 and 4, for the first queen. Then it does the same for the 2nd, 3rd and 4th queen. It then puts the results in a List, which it then yields. |

And if we run the code, we find that it does indeed generate the list of permutations of column numbers 1 through to 4, allowing repetitions (you can see repeating column numbers in some permutations). |

And if we were to print some stats, we’d see that the number of permutations is indeed 4 to the power of 4, i.e. 256 (as opposed to N! = 24). |

And if we create the appropriate square images and pass them to our combine function, then here are the images that it produces for the 256 permutations. |

But how many of these are solutions? |

That depends on N. For N=4, there are 2 solutions. |

Here they are, highlighted in green. |

But how do we find the solutions? |

Martin Odersky suggests using recursion, so let’s refactor our permutations function to use recursion. |

Here is the function again. |

We are going to simplify it so that it does divide and conquer, using recursion. |

To compute the permutations of N queens, we first recursively compute the permutations of N-1 queens. |

This recursive call results in further recursive calls, eventually reaching the base case call, which computes the permutations of zero queens, and which results in a single permutation that is empty. |

When we come out of a recursive call, we have a partial permutation, which we name ‘queens’. We then pick a new column number that we name ‘queen’. |

We then cons ‘queen’ onto ‘queens’, to create a more complete permutation. |

Our refactoring works: if we run the resulting, code we get exactly the same result. |

Here is the recursive code for the N-Queens puzzle, both from the book, and from the course. You can see that the two programs are pretty much the same, except that they both use an ‘isSafe’ function that we have not discussed yet. |

On the left we have the permutation function that we have just refactored to use recursion, and on the right, we have the heart of the recursive code for the N-Queens puzzle. You can see that the code on the left is hard-coded to work only for N=4, whereas the code on the right is more general in that it works for an arbitrary N. You can also see that the code on the right uses the ‘isSafe’ function that we have not discusses so far. |

For the purposes of this talk, the only thing that we are going to say about the ‘isSafe’ function is what it does: If you give it a partial queens permutation computed so far, and a proposed location (column number) for the next queen, it returns, yes or no, true or false, whether or not it is safe to place the next queen in that location. The way it does it is by checking whether the next queen is in the same row, column or diagonal as any of the queens in the partial permutation. |

Let’s refactor our permutations function to use the ‘isSafe’ function. Once we have chosen a column for the next queen, we call the ‘isSafe’ function to find out if it is safe to place the queen in that column, and only if it is safe, do we proceed in the for comprehension, otherwise we backtrack. |

And if we run the refactored code we find that the vast majority of the permutations are considered unsafe and are thus filtered out and discarded, which leaves us with just two permutations… |

…the two solution that we saw earlier. |

Let’s visualise how the filtering process reduces the problem space. |

We start off with an empty 4 by 4 board. |

We then place the first queen on the board. There are four ways of doing that. |

For each of the four resulting boards, we then place the second queen, and again there are four ways of doing that. The ‘isSafe’ function kicks in and identifies some boards as being unsafe, so I have marked them in yellow. |

For each of the remaining safe boards, the white ones, we then place the third queen. Again, there are four ways of doing that. I have again marked in yellow the new boards that the ‘isSafe’ function considers unsafe. I have also marked in orange some placeholders for boards that are not even generated, let alone checked for safety, because they originate from previous boards deemed unsafe. |

On the remaining safe boards (the white ones), we now place the fourth queen. Again, there are four ways of doing that, and again, new unsafe boards are marked in yellow, and new placeholders, for boards not generated, are in orange. We see that almost all generated boards are yellow, i.e. unsafe, and the remaining two, the solutions, are in green. So, the recursive algorithm generated 60 out of 256 possible boards, and found 2 solutions. |

Here are the two solutions again, the ones for N=4. |

When N=5 there are 10 solutions. |

When N=6 there are 4 solutions. |

When N=7 there are 40 solutions. |

When N=8 there are 92 solutions. |

You can see that the number of solutions varies quite a bit depending on the value of N. |

Here is the code for the recursive solution to the N-Queens puzzle, both in Scala and in Haskell, with some unimportant tweaking of the ‘isSafe’ function and its helper functions. |

I went to the Rosetta Code web site, looking for alternative Haskell solutions to the puzzle. I found one which did not use recursion. It used a function called ‘foldM’, which I had not came across before. The variable names that it used did not seem helpful to me. It was succinct. Also, it was heavily commented, suggesting that it was not easy to understand, at least by people new to the ‘foldM’ function. |

In the rest of this talk we will gain an understanding of the ‘foldM’ function, and see how it can be used to write an iterative solution to the puzzle. |

We are all very familiar with the three functions that are the bread, butter and jam of functional programming… |

…i.e. the triad of map, filter and fold. |

Let’s gain an understanding of the ‘foldM’ function by looking at the monadic variant of the triad, i.e. ‘mapM’, ‘filterM’ and ‘foldM’. |

Let’s start with ‘mapM’. |

I came across the function in this Haskell book. |

We are all very familiar with the map function. It takes a function from A to B, and a list of As, and maps it to a list of Bs. |

What we want to do is understand how ‘mapM’ works. We are going to look at the Scala code for it, which we can write ourselves using the Cats Monad type class. |

But before we do that, where does ‘mapM’ live in Haskell and in Scala? |

As Runar Bjarnason (co-author of the red book) says, in Scala, if the question is “how do I do this?”, then the answer is either “don’t do it”, or “traverse” (and its companion function, ‘sequence’). |

Let’s skip the following introduction (or refresher) to the ‘traverse’ function. |

In Haskell, the ‘traverse’ function lives in the ‘traversable’ type class. |

Let’s compare the signatures of the ‘map’ and ‘mapM’ functions. |

While ‘map’ takes a function that returns a B, ‘mapM’ takes a function that also returns a B, but in a monadic context. |

And while ‘map’ returns a list of Bs, ‘mapM’ also returns a list of Bs, but in a monadic context. |

It turns out that in Haskell, ‘mapM’ is defined as ‘traverse’. While ‘traverse’ is based on the Applicative type class, whereas ‘mapM’ is based on the Monad type class, it is possible to define ‘mapM’ as ‘traverse’ because every Monad is also an Applicative. |

In Cats, ‘traverse’ is also defined in Traverse (aka Traversable). |

Using Cats, we can write ourselves some boilerplate Scala code (which you can ignore) to define ‘mapM’ in terms of ‘traverse’, for convenience. |

I am not really that surprised that ‘mapM’ is just ‘traverse’, because in the red book, I have already had my mind blown twice, when I learned that ‘traverse’ is so generic, that it can be used to define both the ‘map’ function, and the ‘foldMap’ function that we saw earlier. |

Let’s skip the following slides on the symmetry that exists in the interrelation of flatMap/foldMap/traverse and flatten/fold/sequence. |

Let’s look at an example of using ‘mapM’ when the monadic context is Option, i.e. ‘mapM’ may or may not return a result. |

If we look at the Scala code at the bottom of the slide, we see a function ‘convert’ that takes a char and, if it is a digit, converts it to an Int, so it returns an Optional Int. In the first assertion, one of the chars is not a digit, so at some point ‘convert’ returns a None, and so ‘mapM’ short-circuits, and also returns a None. In the second assertion, all chars are digits, so ‘convert’ always returns a Some of an Int, and ‘mapM’, which is just ‘traverse’, i.e. it turns a List of Some of Int values inside out, into Some List of Int values. |

Now let’s look at an example of using ‘mapM’ in which the monadic context is a List, i.e. ‘mapM’ returns zero, one or more results. |

We are all very familiar with the map function. If we take a list of strings “X” and “Y” and map over it this function, then the result is quite obvious… |

First “X” is combined, first with “1”… |

…and then with “2”… |

…then “Y” is combined, first with “1”… |

…and then with “2”. |

But what happens when we use ‘mapM’ instead of ‘map’? |

We are calling ‘mapM’ on List(“X”, “Y”), at the head of which is “X”… |

… so “X” is passed to f, which returns List(“X1”,”X2”)… |

…and in the for comprehension we choose the first element, i.e. “X1”… |

…and then, in the for comprehension, we make a recursive call to ‘mapM’… |

…which recursive call is on the tail of the list, at the head of which is “Y1”… |

…so “Y1” is passed to f, which returns List(“Y1”,”Y2”)… |

…and in the for comprehension we choose the first element, i.e. “Y1”… |

…and then, in the for comprehension, we make a recursive call to ‘mapM’… |

…which recursive call is the base case, so it returns a singleton List containing an empty list… |

…so we come out of that recursive call… |

…and in the for comprehension, we choose the first element, i.e. the empty List… |

…and so the for comprehension conses “Y1” onto the empty list and yields the result… |

…now in the for comprehension we switch to the second element, i.e. “Y2” and make a recursive call to ‘mapM’… |

… which recursive call is again the base case… |

…so when we come out of the recursive call, the result is the same singleton List containing an empty list… |

…so in the for comprehension, we again choose the first element of the result list, i.e. an empty list… |

…and the for comprehension then yields the result of consing “Y2” onto the empty list… |

…we then come out of the first recursive call, and the result is List(List("Y1"),List("Y2"))… |

…so in the for comprehension, we choose the first element of that result list, i.e. List(“Y1”)… |

… and so, the for comprehension then conses “X1” onto List(“Y1”)), and this becomes the first value in our result. Now that you have seen how the sausage is made, I am just going to fast-forward through the rest of the computation. |

So, the result that we get when we use ‘mapM’, is quite different from the result that we got when using ‘map’. You can clearly see the effects of the List monad being hard at work combining things in as many different ways as possible. |

I also have an example of using ‘mapM’ when the monadic context is the IO monad, but we don’t have time to go through it, so please look at it in your own time. |

That’s ‘mapM’ introduced. |

Now let’s introduce ‘filterM’. |

I came across it in the same Haskell book. |

What we want to do is understand how ‘filterM’works. We are going to look at the Scala code for it, which we can write ourselves using the Cats Monad type class. |

Actually, to help understand how ‘filterM’works, let’s compare its code with that of ‘mapM’. |

While ‘mapM’ takes a function f from A to B in a monadic context… |

…the function passed to ‘filterM’ is called p, for predicate, and while it also takes an A, it returns a Boolean in a monadic context… |

…and where ‘mapM’ obtains a B from the call to ‘f’, and a list of Bs from the recursive call to ‘mapM’, and yields the result of consing the B onto the Bs… |

…‘filterM’ extracts from the call to ‘p’, a flag that tells it whether or not to retain the current element, and extracts from the recursive call to ‘filterM’ the list of elements retained so far, and so what ‘filterM’ does next is say: if I am keeping the current element, then I am returning the result of consing that element onto the list of elements retained so far, otherwise I am just returning the list of elements retained so far. |

So where does ‘filterM’ live in Scala? |

In Cats, in TraverseFilter, we can find a ‘filterA’ function that is based on Applicative. |

Using Cats, we can write ourselves some boilerplate code (which you can ignore), to define ‘filterM’ in terms of ‘filterA, for convenience. |

And now let’s look at an example of using ‘filterM’ when the monadic context is Option. |

In the bottom left corner of the slide, we see the Scala code for the ‘mapM’ example that we saw earlier… |

Let’s adapt that code so that it serves as an example of using ‘filterM’. In the bottom right corner we see a function called ‘maybeIsEven’, which takes a char and, if it is a digit, converts it to an integer, and then checks whether or not the integer is even, so it returns an Optional Boolean. In the first assertion, one of the chars is not a digit, so at some point ‘maybeIsEven’ returns a None, and so ‘filterM’ short-circuits, and also returns a None. In the second assertion, all chars are digits, so ‘maybeIsEven’ always returns a Some of a flag: Some of false, Some of true, Some of false, Some of true, so what ‘filterM’ does is peer into the Optional values to access the flags and determine which elements of the List to keep. |

Now let’s look at an example of using ‘filterM’ where the monadic context is a List. |

It turns out that if you pass ‘filterM’ a function that ignores its parameter and always returns a list containing both true and false (in other words, I want to eat my cake, and eat it too: I want to both keep the current element and not keep it), then ‘filterM’ returns the list of all the sub-lists of the input list. i.e. it computes the powerset function: the set of all subsets of a set. |

Let’s see an example. If we filter the empty list, then that is the base case… |

…the powerset of the empty list contains just the empty list itself… |

If we filter a singleton list containing the element 3, then ‘filterM’ builds upon the elements of the powerset for the empty list (i.e. just the empty list), by both consing 3 onto them… |

…and not consing 3 onto them… |

… which results in a new powerset containing two lists… |

…and if we filter a list with one more element, e.g. 2, then ‘filterM’ builds upon the elements of the powerset for the singleton list, by both consing 2 onto them… |

…and not consing 2 onto them… |

… which results in a new powerset containing four lists… |

…and so on, and yes, we can see that when we filter a list with elements 1, 2 and 3, ‘filterM’ does indeed compute the powerset for that list. |

That’s ‘filterM’ introduced. |

Now let’s introduce ‘foldM’. |

We are all very familiar with the left fold, which applies a binary operator to an initial value and all the elements of a collection, going from left to right. |

Here is a simple example in which we fold a list of integers using zero as the initial value and integer addition as the binary operator, so the result is 14, i.e. the sum of the integers. |

We want to understand how ‘foldM’ works, so let’s compare the signature of a left fold with that of ‘foldM’. While the binary function passed to a left fold returns an A, the binary function passed to ‘foldM’ also returns an A, but in a monadic context… |

While the overall result of a left fold is an A, the overall result of ‘foldM’ is also an A, but in a monadic context. |

In Scala, we can find ‘foldM’ in Cats, in the Foldable type class. |

Let’s look at an example of using ‘foldM’ where the monadic context is Option, i.e. ‘foldM’ may or may not return a result. |

If we look at the bottom of the slide, at the Scala code, we see a ‘binSmalls’ function that takes an accumulator and a current element, both integers. If the current element is too large, ‘binSmalls’ returns None, otherwise it adds the current element to the accumulator, and returns the result in a Some. In the first assertion, one of the integers in the list is too large, so at some point, ‘binSmalls’ returns None, and so ‘foldM’ short-circuits and also returns None. In the second assertion, all the elements in the list are small, so ‘binSmalls’ always returns a Some of an integer, and so ‘foldM’, which is passed an initial accumulator of zero, peers into the Optional values, to obtain the integer elements of the list, and add them to the accumulator, thus summing up the integers, and returning the same value that a left fold would compute, i.e. 14, but wrapped in a Some. |

That example, in which the function passed to ‘foldM’ returns an Optional value, is easy to understand. It gets harder to understand, when the function passed to ‘foldM’ returns a list of values. But that is exactly how we need to use ‘foldM’ in order to solve the N-Queens puzzle, so in the upcoming slides, we are going to look at three examples that do just that. Before we do that though, let’s take another look at the definition of ‘foldM’. |

Let’s look at the pseudocode in the central rectangle, the one with the bright white background. The ‘foldM’ function takes a binary function that returns its result in a monadic context, an initial accumulator, and a list of elements. What it does is repeatedly invoke the binary function with the current accumulator and the current list element, each time extracting a new accumulator value from the resulting monadic context. It keeps doing that until it gets to the last list element, at which point it invokes the binary function one last time, and this time simply returns the function’s result, which is in a monadic context, since that is what ‘foldM’ does: it returns a result in a monadic context. |

Lets’ skip these slides. |

Now let’s look at an example of using ‘foldM’ where the monadic context is a List. The thing to focus on, in order to understand the upcoming slides, is that while an ordinary fold manages one accumulator, in this example, ‘foldM’ manages zero, one, or more accumulators. |

Let’s skip these slides. |

In our first example, we are going to pass ‘foldM’ a very unlikely function which ignores its two parameters, the accumulator and the current element, and always returns a list with two accumulators, both zero. When we fold an empty list, ‘foldM’ just returns the initial accumulator in a list (in this example the accumulator happens to be 9). |

When we fold a non-empty list, ‘foldM’ invokes the given function with each of the current accumulators. When the list is a singleton, ‘foldM’ calls the given function once (with the initial accumulator), which results in a list of two accumulators. |

When we fold a list with one more element, ‘foldM’ builds on those two accumulators by invoking the given function once for each of the accumulators, which results in two lists of two accumulators, which get flatMapped into a list of four accumulators. |

When we fold a list with one more element, ‘foldM’ builds on those four accumulators by invoking the given function once for each of the accumulators, which results in four lists of two accumulators, which get flatMapped into a list of eight accumulators. |

And so on. As you can see, the number of accumulators doubles with each iteration, so the number of accumulators in the result list is two to the power of the length of the list being folded. |

Let’s skip these slides. |

In the second example, we change the function that we pass to ‘foldM’ so that it is a bit more realistic, a bit more useful. Instead of ignoring its accumulator and current element parameters, the function both adds the current element to the accumulator and subtracts it from the accumulator, returning a two-element list containing both the result of the addition and the result of the subtraction. |

Here is how the computation unfolds, visually. You can see that ‘foldM’ is using the given function to add/subtract the current element to/from the accumulator. You can also see that just like in the previous example, the number of accumulators doubles with each iteration. By the way, the first and last of the values in the result list are the values that would be computed by a plain left fold, using addition and subtraction respectively, as the binary operation. |

Let’s skip these slides. |

In our third example, we change the type of the accumulator from integer to list, and we change the function passed to ‘foldM’ so that rather than adding and subtracting the current element from the accumulator, it both conses the current element onto the accumulator, and refrains from consing the element onto the accumulator. |

Here is how the computation unfolds visually. You can see that in each iteration, the function is both consing, and refraining from consing, the current element onto the current accumulators. As in the previous two examples, the number of accumulators is again doubling with each iteration. |

It turns out that given this particular function, ‘foldM’ computes the powerset of the input list. So we found another way of computing the powerset. |

Let’s skip this slide. |

That’s ‘foldM’ introduced. |

Now that we understand how ‘foldM’ works, let’s see how we can use it to solve the N-Queens combinatorial puzzle. |

We are going to use ‘foldM’ to generate the permutations of a list of integers, namely the column numbers where we can try to place a queen. |

We already have a function that computes the permutations of columns 1,2,3,4. But this function allows repetition (i.e. repeating column numbers - multiple queens allowed in the same column). |

But what we want to compute are the permutations without allowing repetition. i.e. if we have numbers 1,2,…N, we want to compute N! (N factorial) permutations, e.g. for N=3, we want to compute 3! = 3 x 2 x 1 = 6 permutations. Notice how no digit is ever repeated in a permutation. |

We can compute such permutations by passing to ‘foldM’ this update function. The function takes two parameters. The second parameter, i.e. the current element of the list, is ignored. The first argument, i.e. the accumulator, is a pair consisting of (1) a partial permutation computed so far, and (2) a list of choices still available. For each of the available choices, the update function generates a new accumulator whose permutation is arrived at by consing the selected choice onto the partial permutation, and whose choices are arrived at by removing the selected choice from the available choices. Notice how that means that each time the update function is invoked, a partial permutation is grown into one or more fuller permutations, and a list of available choices is shrunk into one or more lists of choices with one less available choice. |

And that works: if we pass the update function to ‘foldM’, with an initial accumulator consisting of the empty permutation as the initial partial permutation, and the list of column numbers to be permuted as the initial list of choices, and if we then specify this same list as the input list to be folded, then ‘foldM’ does indeed compute the permutations. Or rather, it computes a list of pairs, each of which contains one of the permutations, together with some noise in the form of an empty list (we will soon see how to deal with that). Also, notice that the actual content of the input list doesn’t matter, only its length does, as it determines the number of iterations performed by ‘foldM’. For convenience, we use as an input list the same list whose items we want to permute, e.g. 1,2,3. |

Here is how the computation unfold visually. You can see that this time, the number of accumulators does not double every time. Rather, it grows in a way that reflects the computation of N!. In each iteration ‘foldM’ has to choose an integer for each of the current accumulators. In the first iteration, there is one accumulator with three integers to choose from, so ‘foldM’ generated 3 x 1 = 3 accumulators. In the next iteration, for each of those three accumulators, ‘foldM’ has to choose an integer, with two integers to choose from, so ‘foldM’ generated 3 x 2 = 6 accumulators. In the last iteration, for each of those six accumulators, ‘foldM’ has to choose an integer, with only one choice available, so ‘foldM’ generated 6 x 1 = 6 accumulators. So the number of accumulators is 3! = 3 x 2 x 1 = 6. |

As mentioned before, while the result returned by ‘foldM’ includes the desired permutations, it also contains some noise in that each permutation is part of a pair that also contains an empty list. Let’s skip the following slides about the boilerplate code needed to strip out the noise. |

Our domain is the N-Queens puzzle, so let’s do some renaming to reflect that. Let’s rename the ‘update’ function to ‘oneMoreQueen’. Let’s also rename ‘permutation’, the partial permutation generated so far, to ‘queens’, the columns in which we have already placed queens. And let’s rename ‘choices’, the elements which we are still able to pick in the process of creating a permutation, to ‘emptyColumns’, the columns where we still have the choice of trying to place a queen. |

While we are able to use ‘foldM’ to generate permutations without repetition, we are only interested in safe permutations, so let’s use the ‘isSafe’ function to filter out unsafe permutations. |

Here is how we invoked ‘foldM’ earlier, when generating the permutations for 1,2,3. |

But what we need to do now is write a function ‘queens’ that takes integer N, and returns the safe permutations for 1,2,…,N, e.g. for N=4 there are two such permutations. |

So given that we have renamed ‘update’ to ‘oneMoreQueen’, is how we now need to invoke ‘foldM’. |

And finally, as mentioned before, we need to strip out noise from the result of ‘foldM’, so let’s do that by mapping over the result with a function that replaces each pair with its head, which is a permutation, so that the result is mapped to a list of permutations. |

And we are done. Here is the code, both in Haskell and in Scala, for the iterative solution to the N-Queens puzzle, using ‘foldM’. Also included, for completeness, is the ‘isSafe’ function, which we are not going to discuss any further. |

We have gone from a recursive algorithm to an iterative one. |

Here is the recursive code for both Scala and Haskell (ignore the fact that I have refactored the ‘isSafe’ function and its helper functions). |

Here is the transition we have gone through in Haskell. |

And here is the original Haskell code, from the Rosetta Code site, which gave me the idea for this talk. You can see that it includes quite a few comments, because understanding it is hard without comments. It is still not easy, even with comments, especially for people who are new to ‘foldM’. Below is our current version of the code. I have omitted any comments because we are now very familiar with its workings. I believe our renaming of variables in ‘oneMoreQueen’ makes the latter easier to understand. |

Here is the transition we have gone through in Scala. |

And yes, it works. |

It works in Haskell… |

…and it works in Scala. |

To conclude this talk, I want to point out that one way in which the iterative algorithm differs from the recursive one, results in the iterative algorithm doing less work. Because the recursive algorithm does not remember where it has already placed previous queens, it will happily generate, and then test for safety, a candidate board that is obviously unsafe, because it contains two queens in the same column. The iterative algorithm, on the other hand, does remember where it has already placed previous queens. The number of columns still available in the ‘emptyColumns’ argument of the ‘oneMoreQueens’ function, shrinks over time to reflect attempts to place queens in available columns. |

Let's skip these slides. |

Let’s concentrate on the right-hand side of this table. It shows that when N=4, the recursive algorithm generates 60 boards, of which 16 are candidate boards, i.e. they contain the requisite number of queens (4 in this case). The iterative algorithm, on the other hand, generates 32 boards (about half the number), of which 4 are candidate boards (a quarter). So, the iterative algorithm does less work. |

This difference persists and widens as N grows. For N=8, for example, the recursive algorithm generates over 15,718 boards, whereas the iterative algorithm generates 5,506 (about a third). As for candidate boards, the recursive algorithm generates 2,495 boards, whereas the iterative one generates 312 (almost a tenth). |

Let’s skip this slide. |

Here is how the number of boards generated by the two algorithm compares as N grows from 4 to 8. |

And here is the same illustration but for candidate boards. |

That’s all folks. |