N-Queens Combinatorial Puzzle meets Cats - All Slides - With Transcript

#1
slide 1
#2
slide 2
#3
slide 3
  • What we are trying to do in the 8-Queens puzzle is find ways of placing 8 queens on an 8 by 8 chessboard so that no two queens are on the same row, the same column and the same diagonal (no two queens hold each other in check – no two queens attack each other).
#4
slide 4
  • How many ways are there of placing the queens on the board?
#5
slide 5
  • For the first queen, I have a choice of 8 x 8 = 64 cells where I can place it.
  • For the next queen, I have one less choice, i.e. 63.
  • And so on, until I get to the 8th queen, at which point I have only one choice left.
#6
slide 6
  • And if we multiply those numbers, we find that there are more than 178 trillion ways of placing the queens.
#7
slide 7
  • But we can arrive at the same queen configuration in different ways.
  • If we consider a particular placement of the queens e.g. this one, how many way are there of arriving at it ?
#8
slide 8
  • When we pick the first queen, we have 8 choices.
  • When we pick the second queen, we have one less choice, i.e. 7.
  • And so on, until we pick the eighth queen, at which point we have only one choice left.
#9
slide 9
  • And if we multiply those numbers, we find that there are more than (8!) 40,000 ways of arriving at the same queen configuration.
#10
slide 10
  • So, if we want to work out the number of ways of placing the queens so that we arrive at a unique queen configuration, we divide the first number by the second number
#11
slide 11
  • And if we do the math, we see that there are still more than 16 billion ways of arriving at a unique configuration.
#12
slide 12
  • If we take into account that no two queens can be on the same row, then we can drastically reduce the number of candidate boards that we need to consider when looking for a solution.
#13
slide 13
  • To do that, we start from the top row, and move downwards, placing one queen on each row. On each row, there are 8 ways of placing a queen.
  • And if we do the maths, we find that there are 88 ways of doing that, i.e. still more than 16 million ways of doing that.
#14
slide 14
  • We also know that no two queens can be on the same column. So we can again reduce the number of candidate boards we need to consider.
#15
slide 15
  • To do that, we make sure that when placing a queen, we choose a column that has not already been picked.
  • And if we do the math we see that there are (again that number 8!) more than 40,000 ways of doing that.
  • And while we have not yet taken into account that no two queens can be on the same diagonal either, this does give us a good idea of where we are headed in terms of the candidate boards that need to be considered when looking for a solution (we have already greatly reduced the number of candidate boards by several orders of magnitude).
#16
slide 16
  • Let’s try to find a solution to the puzzle.
#17
slide 17
#18
slide 18
  • We place the first queen and cross out the rows, columns and diagonals which are no longer available for placing upcoming queens (because they would be held in check by the queen).
#19
slide 19
  • We place the second queen, and again cross out.
#20
slide 20
  • We place the third queen, and again cross out.
#21
slide 21
  • We place the fourth queen, and again cross out.
#22
slide 22
  • We place the fifth queen, and again cross out.
#23
slide 23
  • But when we try to place the sixth queen, we are stuck.
  • There is nowhere to place it because all the empty cells on the board are held in check by the previously placed queens.
#24
slide 24
  • Let’s try again.
#25
slide 25
#26
slide 26
  • We place the first queen, and cross out.
#27
slide 27
  • We place the second queen, and cross out.
#28
slide 28
  • We place the third queen, and again cross out.
  • With a bit of luck, we'll find a solution.
#29
slide 29
#30
slide 30
#31
slide 31
#32
slide 32
#33
slide 33
#34
slide 34
#35
slide 35
  • The 8-Queens puzzle has always featured in the official Scala book and Scala FP course co-authored by Martin Odersky, because such combinatorial puzzles are particularly suited to solving with for expressions.
#36
slide 36
  • The 8-Queens puzzle is a special case of the more general N-Queens puzzle, in which we have to place N-Queens on an N by N chessboard. Solutions exist for all positive integers N with the exception of 2 and 3.
#37
slide 37
  • In the Scala book, a candidate solution is represented as a list of pairs, with each pair consisting of the row number and column number where a queen is located. The rows and columns start at 1.
#38
slide 38
  • In the Scala FP course, the rows and columns start at 0. Because it is obvious that the Nth queen is in the Nth row, there is no need to be explicit about it, and so a solution is represented by a list of column numbers.
#39
slide 39
  • In the Scala FP course, Odersky provides a ‘show’ function to display a candidate solution board on the console.
#40
slide 40
  • Here is the code for the ‘show’ function. I have tweaked it slightly to use a cross for an empty cell and a crown for a populated cell.
#41
slide 41
  • If we take the representation of a candidate solution board, e.g. this one…
#42
slide 42
  • …and we pass it to the show function, and print the result…
#43
slide 43
  • …we get this, which is pretty good.
#44
slide 44
  • But now let’s look at how we can use a graphics library like Doodle to create an image of a candidate solution board.
#45
slide 45
  • We can easily use Doodle to create a 100 pixel by 100 pixel red square image...
#46
slide 46
  • ...and a blue one.
#47
slide 47
  • In Doodle there is a beside function, which takes two images and returns a composite image that is the first image beside the second image.
#48
slide 48
  • Similarly, there is an above function, which takes two images and returns a composite image that is the first image above the second image.
#49
slide 49
  • So, if we have a candidate solution board…
#50
slide 50
  • …what we can do is create square images for each of the cells in the board, a white square for an empty cell, and a red square for a populated cell.
#51
slide 51
  • And then we can use the beside and above functions to stitch the square images into a single composite image for the whole board.
#52
slide 52
  • So, let’s create ourselves a white square image and a red square image…
#53
slide 53
  • And now we can write ourselves a combine function which takes a grid of square images and returns an image that is the composition of all those images.
#54
slide 54
  • The way it works is that it first maps over the rows of square images.
#55
slide 55
  • For each row of images, it reduces the images with the beside function, stitching them together, one beside the other, into a single composite image for the whole row.
#56
slide 56
  • Once it has done that, it takes the resulting images for the rows, and combines them with the above function, stitching them together, one above the other, into a single composite image for the whole board.
#57
slide 57
  • Our combine function is using the reduce function, which is unsafe in that when reduce is invoked on an empty collection it throws an exception.
#58
slide 58
  • To be on the safe side, let’s use the fold function instead of the reduce function. To do that, we must tell the fold function what to return when the collection is empty, so we tell it to return the empty image.
#59
slide 59
  • Also, fold is just an alias for foldLeft, so let’s make that explicit.
#60
slide 60
  • In this section of the talk, we need the notion of a monoid. For the purposes of this talk, a monoid consists of three things. The first thing is the set of elements of a given type, e.g., the positive integer numbers. The second thing is a binary associative operation on the elements of that type, e.g., integer addition. And the third thing is an identity element for the binary operation, e.g., zero. In the foldable trait, in the red book, we see a ‘concatenate’ function that operates on a foldable of A, and takes as argument a monoid for A. The ‘concatenate’ function does a left fold of the As, using the monoid’s identity as the initial accumulator, and using the monoid’s binary operation to combine all the foldable’s As into an A.
#61
slide 61
  • In Cats, the ‘concatenate’ function is called ‘fold’, but there is also a ‘combineAll’ alias, to avoid clashes with the ‘fold’ function in the Scala standard library.
#62
slide 62
  • Let’s skip the following slides, which are an introduction to (or a refresher on) monoids.
#63
slide 63
#64
slide 64
#65
slide 65
#66
slide 66
#67
slide 67
#68
slide 68
#69
slide 69
#70
slide 70
#71
slide 71
#72
slide 72
#73
slide 73
#74
slide 74
#75
slide 75
  • Back to the code for our ‘combine’ function. It does a ‘foldLeft‘, with the empty image as initial accumulator and the ‘beside’ function as binary operation.
#76
slide 76
  • What we can do is use Cats to define a monoid for images, whose identity is the empty image, and whose binary operation is the ‘beside’ function. For lack of imagination, let’s call this the ‘beside’ monoid.
#77
slide 77
  • And now, instead of doing a left fold with the empty image and the ‘beside’ function, we can do a ‘combineAll’ with the ‘beside’ monoid.
#78
slide 78
  • In the ‘foldable’ trait, we also see another function, called ‘foldMap’, which builds on top of the functionality provided by ‘concatenate’. It also operates on a ‘foldable’ of As, but it takes an additional parameter f, which is a function from A to B. Also, not that the monoid parameter is for B rather than for A. Logically, what ‘foldMap’ does is first map the As to Bs, and then proceed the same way that ‘concatenate’ does, i.e., by folding the elements using the monoid’s identity and binary operation. By the way, on this slide, ‘foldMap’ is implemented using ‘foldRight’, but it can just as well be implemented using ‘foldLeft’ (see next slide).
#79
slide 79
#80
slide 80
  • Back to the code for our ‘combine’ function.
#81
slide 81
  • Purely to make the next step easier to understand, let’s revert to using the unsafe ‘reduce’ function rather than ‘foldLeft’.
#82
slide 82
#83
slide 83
  • Our ‘combine’ function is doing a map of a function, followed by a reduce with the ‘above’ function.
  • A map followed by a reduce is exactly what ‘foldMap’ does (the function we have just seen).
#84
slide 84
  • What we can do is use Cats to define another monoid for images.
  • Its identity is still the empty image, but its binary operation is the ‘above’ function.
  • Again, for lack of imagination, let’s call it the ‘above’ monoid.
#85
slide 85
  • Now, instead of doing a map with a function, followed by a reduce with the ‘above’ function, we can do a ‘foldMap’ with the function and with the ‘above’ monoid.
#86
slide 86
  • Here is a recap of the start and end points of our small refactoring journey.
#87
slide 87
  • And it works, the refactored code produces exactly the same result (on this slide, I am using the ‘fold’ name rather than the ‘combineAll’ one).
#88
slide 88
  • Now let’s go back to the show function…
#89
slide 89
  • …and where it uses crosses and crowns, let’s use white square images, and red square images…
#90
slide 90
  • …and where it stitches together smaller strings into larger strings, let’s put the square images into a grid, and pass them to our ‘combine’ function, so that instead of getting a string representation of a candidate solution board, we get a composite image for the board.
#91
slide 91
#92
slide 92
  • We can do a bit more though.
#93
slide 93
  • The images that we have been creating up to now have been centred on the origin of the coordinate system.
#94
slide 94
  • In Doodle, if call an image’s ‘at’ function with the coordinates of a desired location, we get back the same image but centred on that location. And if we are prepared to do that, then we can further simplify how we combine images.
#95
slide 95
  • Back to our combine function.
#96
slide 96
  • We can now use Cats to define another image monoid, one whose identity is still the empty image, but whose binary operation is the ‘on’ function, which takes two images, and returns a composite image that is the first image superimposed on the second image.
  • Again, for lack of imagination, let’s call this the ‘on’ monoid.
#97
slide 97
  • Now instead of using the ‘beside’ and ‘above’ monoids, we can just use the ‘on’ monoid.
#98
slide 98
#99
slide 99
  • And if we make the ‘on’ monoid implicit, i.e., if we say that the default monoid for images is the ‘on’ monoid, then we can simplify things further, because ‘foldMap’ and ‘combineAll’ are happy to take the ‘on’ monoid implicitly.
#100
slide 100
  • To recap, here are the start and end points of our whole refactoring journey, and hopefully you’ll agree that the bottom function is simpler, provided we understand what ‘foldMap’ and ‘combineAll’ do, and what the ‘on’ monoid is.
#101
slide 101
#102
slide 102
  • Earlier we looked at the number of ways of placing 8 queens on an 8 by 8 chessboard, one queen per row. We came across 88, which is the number of permutations of 8 queens, allowing repetition, and 8!, which is the number of permutations of 8 queens disallowing repetitions.
#103
slide 103
  • So, for arbitrary N, the formulas are N to the power of N and N!
#104
slide 104
  • But how do we compute the permutations of N queens, e.g. for N=4?
#105
slide 105
  • We can use this simple program.
  • The for expression first chooses a column number, between 1 and 4, for the first queen.
  • Then it does the same for the 2nd, 3rd and 4th queen. It then puts the results in a List, which it then yields.
#106
slide 106
  • And if we run the code, we find that it does indeed generate the list of permutations of column numbers 1 through to 4, allowing repetitions (you can see repeating column numbers in some permutations).
#107
slide 107
  • And if we were to print some stats, we’d see that the number of permutations is indeed 4 to the power of 4, i.e. 256 (as opposed to N! = 24).
#108
slide 108
  • And if we create the appropriate square images and pass them to our combine function, then here are the images that it produces for the 256 permutations.
#109
slide 109
  • But how many of these are solutions?
#110
slide 110
  • That depends on N. For N=4, there are 2 solutions.
#111
slide 111
  • Here they are, highlighted in green.
#112
slide 112
  • But how do we find the solutions?
#113
slide 113
  • Martin Odersky suggests using recursion, so let’s refactor our permutations function to use recursion.
#114
slide 114
#115
slide 115
#116
slide 116
  • Here is the function again.
#117
slide 117
  • We are going to simplify it so that it does divide and conquer, using recursion.
#118
slide 118
  • To compute the permutations of N queens, we first recursively compute the permutations of N-1 queens.
#119
slide 119
  • This recursive call results in further recursive calls, eventually reaching the base case call, which computes the permutations of zero queens, and which results in a single permutation that is empty.
#120
slide 120
  • When we come out of a recursive call, we have a partial permutation, which we name ‘queens’.
  • We then pick a new column number that we name ‘queen’.
#121
slide 121
  • We then cons ‘queen’ onto ‘queens’, to create a more complete permutation.
#122
slide 122
  • Our refactoring works: if we run the resulting, code we get exactly the same result.
#123
slide 123
  • Here is the recursive code for the N-Queens puzzle, both from the book, and from the course.
  • You can see that the two programs are pretty much the same, except that they both use an ‘isSafe’ function that we have not discussed yet.
#124
slide 124
  • On the left we have the permutation function that we have just refactored to use recursion, and on the right, we have the heart of the recursive code for the N-Queens puzzle.
  • You can see that the code on the left is hard-coded to work only for N=4, whereas the code on the right is more general in that it works for an arbitrary N.
  • You can also see that the code on the right uses the ‘isSafe’ function that we have not discusses so far.
#125
slide 125
  • For the purposes of this talk, the only thing that we are going to say about the ‘isSafe’ function is what it does: If you give it a partial queens permutation computed so far, and a proposed location (column number) for the next queen, it returns, yes or no, true or false, whether or not it is safe to place the next queen in that location.
  • The way it does it is by checking whether the next queen is in the same row, column or diagonal as any of the queens in the partial permutation.
#126
slide 126
  • Let’s refactor our permutations function to use the ‘isSafe’ function.
  • Once we have chosen a column for the next queen, we call the ‘isSafe’ function to find out if it is safe to place the queen in that column, and only if it is safe, do we proceed in the for comprehension, otherwise we backtrack.
#127
slide 127
#128
slide 128
  • And if we run the refactored code we find that the vast majority of the permutations are considered unsafe and are thus filtered out and discarded, which leaves us with just two permutations…
#129
slide 129
  • …the two solution that we saw earlier.
#130
slide 130
  • Let’s visualise how the filtering process reduces the problem space.
#131
slide 131
  • We start off with an empty 4 by 4 board.
#132
slide 132
  • We then place the first queen on the board. There are four ways of doing that.
#133
slide 133
  • For each of the four resulting boards, we then place the second queen, and again there are four ways of doing that.
  • The ‘isSafe’ function kicks in and identifies some boards as being unsafe, so I have marked them in yellow.
#134
slide 134
  • For each of the remaining safe boards, the white ones, we then place the third queen. Again, there are four ways of doing that.
  • I have again marked in yellow the new boards that the ‘isSafe’ function considers unsafe.
  • I have also marked in orange some placeholders for boards that are not even generated, let alone checked for safety, because they originate from previous boards deemed unsafe.
#135
slide 135
  • On the remaining safe boards (the white ones), we now place the fourth queen.
  • Again, there are four ways of doing that, and again, new unsafe boards are marked in yellow, and new placeholders, for boards not generated, are in orange.
  • We see that almost all generated boards are yellow, i.e. unsafe, and the remaining two, the solutions, are in green.
  • So, the recursive algorithm generated 60 out of 256 possible boards, and found 2 solutions.
#136
slide 136
  • Here are the two solutions again, the ones for N=4.
#137
slide 137
  • When N=5 there are 10 solutions.
#138
slide 138
  • When N=6 there are 4 solutions.
#139
slide 139
  • When N=7 there are 40 solutions.
#140
slide 140
  • When N=8 there are 92 solutions.
#141
slide 141
  • You can see that the number of solutions varies quite a bit depending on the value of N.
#142
slide 142
  • Here is the code for the recursive solution to the N-Queens puzzle, both in Scala and in Haskell, with some unimportant tweaking of the ‘isSafe’ function and its helper functions.
#143
slide 143
  • I went to the Rosetta Code web site, looking for alternative Haskell solutions to the puzzle.
  • I found one which did not use recursion. It used a function called ‘foldM’, which I had not came across before.
  • The variable names that it used did not seem helpful to me.
  • It was succinct.
  • Also, it was heavily commented, suggesting that it was not easy to understand, at least by people new to the ‘foldM’ function.
#144
slide 144
  • In the rest of this talk we will gain an understanding of the ‘foldM’ function, and see how it can be used to write an iterative solution to the puzzle.
#145
slide 145
  • We are all very familiar with the three functions that are the bread, butter and jam of functional programming…
#146
slide 146
  • …i.e. the triad of map, filter and fold.
#147
slide 147
  • Let’s gain an understanding of the ‘foldM’ function by looking at the monadic variant of the triad, i.e. ‘mapM’, ‘filterM’ and ‘foldM’.
#148
slide 148
  • Let’s start with ‘mapM’.
#149
slide 149
  • I came across the function in this Haskell book.
#150
slide 150
  • We are all very familiar with the map function.
  • It takes a function from A to B, and a list of As, and maps it to a list of Bs.
#151
slide 151
  • What we want to do is understand how ‘mapM’ works.
  • We are going to look at the Scala code for it, which we can write ourselves using the Cats Monad type class.
#152
slide 152
#153
slide 153
  • But before we do that, where does ‘mapM’ live in Haskell and in Scala?
#154
slide 154
  • As Runar Bjarnason (co-author of the red book) says, in Scala, if the question is “how do I do this?”, then the answer is either “don’t do it”, or “traverse” (and its companion function, ‘sequence’).
#155
slide 155
  • Let’s skip the following introduction (or refresher) to the ‘traverse’ function.
#156
slide 156
#157
slide 157
#158
slide 158
#159
slide 159
#160
slide 160
#161
slide 161
#162
slide 162
#163
slide 163
#164
slide 164
#165
slide 165
#166
slide 166
#167
slide 167
#168
slide 168
  • In Haskell, the ‘traverse’ function lives in the ‘traversable’ type class.
#169
slide 169
  • Let’s compare the signatures of the ‘map’ and ‘mapM’ functions.
#170
slide 170
  • While ‘map’ takes a function that returns a B, ‘mapM’ takes a function that also returns a B, but in a monadic context.
#171
slide 171
  • And while ‘map’ returns a list of Bs, ‘mapM’ also returns a list of Bs, but in a monadic context.
#172
slide 172
  • It turns out that in Haskell, ‘mapM’ is defined as ‘traverse’.
  • While ‘traverse’ is based on the Applicative type class, whereas ‘mapM’ is based on the Monad type class, it is possible to define ‘mapM’ as ‘traverse’ because every Monad is also an Applicative.
#173
slide 173
  • In Cats, ‘traverse’ is also defined in Traverse (aka Traversable).
#174
slide 174
  • Using Cats, we can write ourselves some boilerplate Scala code (which you can ignore) to define ‘mapM’ in terms of ‘traverse’, for convenience.
#175
slide 175
  • I am not really that surprised that ‘mapM’ is just ‘traverse’, because in the red book, I have already had my mind blown twice, when I learned that ‘traverse’ is so generic, that it can be used to define both the ‘map’ function, and the ‘foldMap’ function that we saw earlier.
#176
slide 176
#177
slide 177
#178
slide 178
  • Let’s skip the following slides on the symmetry that exists in the interrelation of flatMap/foldMap/traverse and flatten/fold/sequence.
#179
slide 179
#180
slide 180
#181
slide 181
#182
slide 182
#183
slide 183
#184
slide 184
#185
slide 185
#186
slide 186
  • Let’s look at an example of using ‘mapM’ when the monadic context is Option, i.e. ‘mapM’ may or may not return a result.
#187
slide 187
#188
slide 188
  • If we look at the Scala code at the bottom of the slide, we see a function ‘convert’ that takes a char and, if it is a digit, converts it to an Int, so it returns an Optional Int.
  • In the first assertion, one of the chars is not a digit, so at some point ‘convert’ returns a None, and so ‘mapM’ short-circuits, and also returns a None.
  • In the second assertion, all chars are digits, so ‘convert’ always returns a Some of an Int, and ‘mapM’, which is just ‘traverse’, i.e. it turns a List of Some of Int values inside out, into Some List of Int values.
#189
slide 189
  • Now let’s look at an example of using ‘mapM’ in which the monadic context is a List, i.e. ‘mapM’ returns zero, one or more results.
#190
slide 190
#191
slide 191
  • We are all very familiar with the map function. If we take a list of strings “X” and “Y” and map over it this function, then the result is quite obvious…
#192
slide 192
  • First “X” is combined, first with “1”…
#193
slide 193
  • …and then with “2”…
#194
slide 194
  • …then “Y” is combined, first with “1”…
#195
slide 195
  • …and then with “2”.
#196
slide 196
  • But what happens when we use ‘mapM’ instead of ‘map’?
#197
slide 197
  • We are calling ‘mapM’ on List(“X”, “Y”), at the head of which is “X”…
#198
slide 198
  • … so “X” is passed to f, which returns List(“X1”,”X2”)…
#199
slide 199
  • …and in the for comprehension we choose the first element, i.e. “X1”…
#200
slide 200
  • …and then, in the for comprehension, we make a recursive call to ‘mapM’…
#201
slide 201
  • …which recursive call is on the tail of the list, at the head of which is “Y1”…
#202
slide 202
  • …so “Y1” is passed to f, which returns List(“Y1”,”Y2”)…
#203
slide 203
  • …and in the for comprehension we choose the first element, i.e. “Y1”…
#204
slide 204
  • …and then, in the for comprehension, we make a recursive call to ‘mapM’…
#205
slide 205
  • …which recursive call is the base case, so it returns a singleton List containing an empty list…
#206
slide 206
  • …so we come out of that recursive call…
#207
slide 207
  • …and in the for comprehension, we choose the first element, i.e. the empty List…
#208
slide 208
  • …and so the for comprehension conses “Y1” onto the empty list and yields the result…
#209
slide 209
  • …now in the for comprehension we switch to the second element, i.e. “Y2” and make a recursive call to ‘mapM’…
#210
slide 210
  • … which recursive call is again the base case…
#211
slide 211
  • …so when we come out of the recursive call, the result is the same singleton List containing an empty list…
#212
slide 212
  • …so in the for comprehension, we again choose the first element of the result list, i.e. an empty list…
#213
slide 213
  • …and the for comprehension then yields the result of consing “Y2” onto the empty list…
#214
slide 214
  • …we then come out of the first recursive call, and the result is List(List("Y1"),List("Y2"))…
#215
slide 215
  • …so in the for comprehension, we choose the first element of that result list, i.e. List(“Y1”)…
#216
slide 216
  • … and so, the for comprehension then conses “X1” onto List(“Y1”)), and this becomes the first value in our result.
  • Now that you have seen how the sausage is made, I am just going to fast-forward through the rest of the computation.
#217
slide 217
#218
slide 218
#219
slide 219
#220
slide 220
#221
slide 221
#222
slide 222
#223
slide 223
#224
slide 224
#225
slide 225
#226
slide 226
  • So, the result that we get when we use ‘mapM’, is quite different from the result that we got when using ‘map’. You can clearly see the effects of the List monad being hard at work combining things in as many different ways as possible.
#227
slide 227
  • I also have an example of using ‘mapM’ when the monadic context is the IO monad, but we don’t have time to go through it, so please look at it in your own time.
#228
slide 228
#229
slide 229
#230
slide 230
#231
slide 231
#232
slide 232
#233
slide 233
#234
slide 234
#235
slide 235
#236
slide 236
#237
slide 237
  • That’s ‘mapM’ introduced.
#238
slide 238
#239
slide 239
  • Now let’s introduce ‘filterM’.
#240
slide 240
  • I came across it in the same Haskell book.
#241
slide 241
  • What we want to do is understand how ‘filterM’works.
  • We are going to look at the Scala code for it, which we can write ourselves using the Cats Monad type class.
#242
slide 242
  • Actually, to help understand how ‘filterM’works, let’s compare its code with that of ‘mapM’.
#243
slide 243
  • While ‘mapM’ takes a function f from A to B in a monadic context…
#244
slide 244
  • …the function passed to ‘filterM’ is called p, for predicate, and while it also takes an A, it returns a Boolean in a monadic context…
#245
slide 245
  • …and where ‘mapM’ obtains a B from the call to ‘f’, and a list of Bs from the recursive call to ‘mapM’, and yields the result of consing the B onto the Bs…
#246
slide 246
  • …‘filterM’ extracts from the call to ‘p’, a flag that tells it whether or not to retain the current element, and extracts from the recursive call to ‘filterM’ the list of elements retained so far, and so what ‘filterM’ does next is say: if I am keeping the current element, then I am returning the result of consing that element onto the list of elements retained so far, otherwise I am just returning the list of elements retained so far.
#247
slide 247
  • So where does ‘filterM’ live in Scala?
#248
slide 248
  • In Cats, in TraverseFilter, we can find a ‘filterA’ function that is based on Applicative.
#249
slide 249
  • Using Cats, we can write ourselves some boilerplate code (which you can ignore), to define ‘filterM’ in terms of ‘filterA, for convenience.
#250
slide 250
  • And now let’s look at an example of using ‘filterM’ when the monadic context is Option.
#251
slide 251
#252
slide 252
  • In the bottom left corner of the slide, we see the Scala code for the ‘mapM’ example that we saw earlier…
#253
slide 253
  • Let’s adapt that code so that it serves as an example of using ‘filterM’.
  • In the bottom right corner we see a function called ‘maybeIsEven’, which takes a char and, if it is a digit, converts it to an integer, and then checks whether or not the integer is even, so it returns an Optional Boolean.
  • In the first assertion, one of the chars is not a digit, so at some point ‘maybeIsEven’ returns a None, and so ‘filterM’ short-circuits, and also returns a None.
  • In the second assertion, all chars are digits, so ‘maybeIsEven’ always returns a Some of a flag: Some of false, Some of true, Some of false, Some of true, so what ‘filterM’ does is peer into the Optional values to access the flags and determine which elements of the List to keep.
#254
slide 254
  • Now let’s look at an example of using ‘filterM’ where the monadic context is a List.
#255
slide 255
#256
slide 256
  • It turns out that if you pass ‘filterM’ a function that ignores its parameter and always returns a list containing both true and false (in other words, I want to eat my cake, and eat it too: I want to both keep the current element and not keep it), then ‘filterM’ returns the list of all the sub-lists of the input list. i.e. it computes the powerset function: the set of all subsets of a set.
#257
slide 257
  • Let’s see an example.
  • If we filter the empty list, then that is the base case…
#258
slide 258
  • …the powerset of the empty list contains just the empty list itself…
#259
slide 259
  • If we filter a singleton list containing the element 3, then ‘filterM’ builds upon the elements of the powerset for the empty list (i.e. just the empty list), by both consing 3 onto them…
#260
slide 260
  • …and not consing 3 onto them…
#261
slide 261
  • … which results in a new powerset containing two lists…
#262
slide 262
  • …and if we filter a list with one more element, e.g. 2, then ‘filterM’ builds upon the elements of the powerset for the singleton list, by both consing 2 onto them…
#263
slide 263
#264
slide 264
  • …and not consing 2 onto them…
#265
slide 265
#266
slide 266
  • … which results in a new powerset containing four lists…
#267
slide 267
  • …and so on, and yes, we can see that when we filter a list with elements 1, 2 and 3, ‘filterM’ does indeed compute the powerset for that list.
#268
slide 268
#269
slide 269
#270
slide 270
#271
slide 271
#272
slide 272
#273
slide 273
#274
slide 274
#275
slide 275
#276
slide 276
  • That’s ‘filterM’ introduced.
#277
slide 277
#278
slide 278
  • Now let’s introduce ‘foldM’.
#279
slide 279
  • We are all very familiar with the left fold, which applies a binary operator to an initial value and all the elements of a collection, going from left to right.
#280
slide 280
  • Here is a simple example in which we fold a list of integers using zero as the initial value and integer addition as the binary operator, so the result is 14, i.e. the sum of the integers.
#281
slide 281
  • We want to understand how ‘foldM’ works, so let’s compare the signature of a left fold with that of ‘foldM’.
  • While the binary function passed to a left fold returns an A, the binary function passed to ‘foldM’ also returns an A, but in a monadic context…
#282
slide 282
  • While the overall result of a left fold is an A, the overall result of ‘foldM’ is also an A, but in a monadic context.
#283
slide 283
  • In Scala, we can find ‘foldM’ in Cats, in the Foldable type class.
#284
slide 284
  • Let’s look at an example of using ‘foldM’ where the monadic context is Option, i.e. ‘foldM’ may or may not return a result.
#285
slide 285
#286
slide 286
  • If we look at the bottom of the slide, at the Scala code, we see a ‘binSmalls’ function that takes an accumulator and a current element, both integers.
  • If the current element is too large, ‘binSmalls’ returns None, otherwise it adds the current element to the accumulator, and returns the result in a Some.
  • In the first assertion, one of the integers in the list is too large, so at some point, ‘binSmalls’ returns None, and so ‘foldM’ short-circuits and also returns None.
  • In the second assertion, all the elements in the list are small, so ‘binSmalls’ always returns a Some of an integer, and so ‘foldM’, which is passed an initial accumulator of zero, peers into the Optional values, to obtain the integer elements of the list, and add them to the accumulator, thus summing up the integers, and returning the same value that a left fold would compute, i.e. 14, but wrapped in a Some.
#287
slide 287
  • That example, in which the function passed to ‘foldM’ returns an Optional value, is easy to understand.
  • It gets harder to understand, when the function passed to ‘foldM’ returns a list of values.
  • But that is exactly how we need to use ‘foldM’ in order to solve the N-Queens puzzle, so in the upcoming slides, we are going to look at three examples that do just that.
  • Before we do that though, let’s take another look at the definition of ‘foldM’.
#288
slide 288
#289
slide 289
#290
slide 290
#291
slide 291
#292
slide 292
#293
slide 293
  • Let’s look at the pseudocode in the central rectangle, the one with the bright white background.
  • The ‘foldM’ function takes a binary function that returns its result in a monadic context, an initial accumulator, and a list of elements.
  • What it does is repeatedly invoke the binary function with the current accumulator and the current list element, each time extracting a new accumulator value from the resulting monadic context.
  • It keeps doing that until it gets to the last list element, at which point it invokes the binary function one last time, and this time simply returns the function’s result, which is in a monadic context, since that is what ‘foldM’ does: it returns a result in a monadic context.
#294
slide 294
  • Lets’ skip these slides.
#295
slide 295
#296
slide 296
#297
slide 297
  • Now let’s look at an example of using ‘foldM’ where the monadic context is a List.
  • The thing to focus on, in order to understand the upcoming slides, is that while an ordinary fold manages one accumulator, in this example, ‘foldM’ manages zero, one, or more accumulators.
#298
slide 298
#299
slide 299
#300
slide 300
#301
slide 301
#302
slide 302
  • In our first example, we are going to pass ‘foldM’ a very unlikely function which ignores its two parameters, the accumulator and the current element, and always returns a list with two accumulators, both zero. When we fold an empty list, ‘foldM’ just returns the initial accumulator in a list (in this example the accumulator happens to be 9).
#303
slide 303
  • When we fold a non-empty list, ‘foldM’ invokes the given function with each of the current accumulators.
  • When the list is a singleton, ‘foldM’ calls the given function once (with the initial accumulator), which results in a list of two accumulators.
#304
slide 304
  • When we fold a list with one more element, ‘foldM’ builds on those two accumulators by invoking the given function once for each of the accumulators, which results in two lists of two accumulators, which get flatMapped into a list of four accumulators.
#305
slide 305
  • When we fold a list with one more element, ‘foldM’ builds on those four accumulators by invoking the given function once for each of the accumulators, which results in four lists of two accumulators, which get flatMapped into a list of eight accumulators.
#306
slide 306
  • And so on.
  • As you can see, the number of accumulators doubles with each iteration, so the number of accumulators in the result list is two to the power of the length of the list being folded.
#307
slide 307
  • Let’s skip these slides.
#308
slide 308
#309
slide 309
#310
slide 310
#311
slide 311
  • In the second example, we change the function that we pass to ‘foldM’ so that it is a bit more realistic, a bit more useful.
  • Instead of ignoring its accumulator and current element parameters, the function both adds the current element to the accumulator and subtracts it from the accumulator, returning a two-element list containing both the result of the addition and the result of the subtraction.
#312
slide 312
#313
slide 313
#314
slide 314
  • Here is how the computation unfolds, visually. You can see that ‘foldM’ is using the given function to add/subtract the current element to/from the accumulator.
  • You can also see that just like in the previous example, the number of accumulators doubles with each iteration.
  • By the way, the first and last of the values in the result list are the values that would be computed by a plain left fold, using addition and subtraction respectively, as the binary operation.
#315
slide 315
  • Let’s skip these slides.
#316
slide 316
#317
slide 317
#318
slide 318
#319
slide 319
  • In our third example, we change the type of the accumulator from integer to list, and we change the function passed to ‘foldM’ so that rather than adding and subtracting the current element from the accumulator, it both conses the current element onto the accumulator, and refrains from consing the element onto the accumulator.
#320
slide 320
#321
slide 321
#322
slide 322
  • Here is how the computation unfolds visually.
  • You can see that in each iteration, the function is both consing, and refraining from consing, the current element onto the current accumulators.
  • As in the previous two examples, the number of accumulators is again doubling with each iteration.
#323
slide 323
  • It turns out that given this particular function, ‘foldM’ computes the powerset of the input list. So we found another way of computing the powerset.
#324
slide 324
  • Let’s skip this slide.
#325
slide 325
  • That’s ‘foldM’ introduced.
#326
slide 326
#327
slide 327
  • Now that we understand how ‘foldM’ works, let’s see how we can use it to solve the N-Queens combinatorial puzzle.
#328
slide 328
  • We are going to use ‘foldM’ to generate the permutations of a list of integers, namely the column numbers where we can try to place a queen.
#329
slide 329
  • We already have a function that computes the permutations of columns 1,2,3,4.
  • But this function allows repetition (i.e. repeating column numbers - multiple queens allowed in the same column).
#330
slide 330
  • But what we want to compute are the permutations without allowing repetition.
  • i.e. if we have numbers 1,2,…N, we want to compute N! (N factorial) permutations, e.g. for N=3, we want to compute 3! = 3 x 2 x 1 = 6 permutations.
  • Notice how no digit is ever repeated in a permutation.
#331
slide 331
#332
slide 332
#333
slide 333
  • We can compute such permutations by passing to ‘foldM’ this update function.
  • The function takes two parameters. The second parameter, i.e. the current element of the list, is ignored.
  • The first argument, i.e. the accumulator, is a pair consisting of (1) a partial permutation computed so far, and (2) a list of choices still available.
  • For each of the available choices, the update function generates a new accumulator whose permutation is arrived at by consing the selected choice onto the partial permutation, and whose choices are arrived at by removing the selected choice from the available choices.
  • Notice how that means that each time the update function is invoked, a partial permutation is grown into one or more fuller permutations, and a list of available choices is shrunk into one or more lists of choices with one less available choice.
#334
slide 334
  • And that works: if we pass the update function to ‘foldM’, with an initial accumulator consisting of the empty permutation as the initial partial permutation, and the list of column numbers to be permuted as the initial list of choices, and if we then specify this same list as the input list to be folded, then ‘foldM’ does indeed compute the permutations.
  • Or rather, it computes a list of pairs, each of which contains one of the permutations, together with some noise in the form of an empty list (we will soon see how to deal with that).
  • Also, notice that the actual content of the input list doesn’t matter, only its length does, as it determines the number of iterations performed by ‘foldM’.
  • For convenience, we use as an input list the same list whose items we want to permute, e.g. 1,2,3.
#335
slide 335
#336
slide 336
#337
slide 337
#338
slide 338
  • Here is how the computation unfold visually. You can see that this time, the number of accumulators does not double every time.
  • Rather, it grows in a way that reflects the computation of N!.
  • In each iteration ‘foldM’ has to choose an integer for each of the current accumulators.
  • In the first iteration, there is one accumulator with three integers to choose from, so ‘foldM’ generated 3 x 1 = 3 accumulators.
  • In the next iteration, for each of those three accumulators, ‘foldM’ has to choose an integer, with two integers to choose from, so ‘foldM’ generated 3 x 2 = 6 accumulators.
  • In the last iteration, for each of those six accumulators, ‘foldM’ has to choose an integer, with only one choice available, so ‘foldM’ generated 6 x 1 = 6 accumulators.
  • So the number of accumulators is 3! = 3 x 2 x 1 = 6.
#339
slide 339
  • As mentioned before, while the result returned by ‘foldM’ includes the desired permutations, it also contains some noise in that each permutation is part of a pair that also contains an empty list.
  • Let’s skip the following slides about the boilerplate code needed to strip out the noise.
#340
slide 340
#341
slide 341
#342
slide 342
#343
slide 343
  • Our domain is the N-Queens puzzle, so let’s do some renaming to reflect that.
  • Let’s rename the ‘update’ function to ‘oneMoreQueen’. Let’s also rename ‘permutation’, the partial permutation generated so far, to ‘queens’, the columns in which we have already placed queens.
  • And let’s rename ‘choices’, the elements which we are still able to pick in the process of creating a permutation, to ‘emptyColumns’, the columns where we still have the choice of trying to place a queen.
#344
slide 344
  • While we are able to use ‘foldM’ to generate permutations without repetition, we are only interested in safe permutations, so let’s use the ‘isSafe’ function to filter out unsafe permutations.
#345
slide 345
  • Here is how we invoked ‘foldM’ earlier, when generating the permutations for 1,2,3.
#346
slide 346
  • But what we need to do now is write a function ‘queens’ that takes integer N, and returns the safe permutations for 1,2,…,N, e.g. for N=4 there are two such permutations.
#347
slide 347
  • So given that we have renamed ‘update’ to ‘oneMoreQueen’, is how we now need to invoke ‘foldM’.
#348
slide 348
  • And finally, as mentioned before, we need to strip out noise from the result of ‘foldM’, so let’s do that by mapping over the result with a function that replaces each pair with its head, which is a permutation, so that the result is mapped to a list of permutations.
#349
slide 349
  • And we are done. Here is the code, both in Haskell and in Scala, for the iterative solution to the N-Queens puzzle, using ‘foldM’.
  • Also included, for completeness, is the ‘isSafe’ function, which we are not going to discuss any further.
#350
slide 350
  • We have gone from a recursive algorithm to an iterative one.
#351
slide 351
  • Here is the recursive code for both Scala and Haskell (ignore the fact that I have refactored the ‘isSafe’ function and its helper functions).
#352
slide 352
  • Here is the transition we have gone through in Haskell.
#353
slide 353
  • And here is the original Haskell code, from the Rosetta Code site, which gave me the idea for this talk.
  • You can see that it includes quite a few comments, because understanding it is hard without comments.
  • It is still not easy, even with comments, especially for people who are new to ‘foldM’. Below is our current version of the code.
  • I have omitted any comments because we are now very familiar with its workings.
  • I believe our renaming of variables in ‘oneMoreQueen’ makes the latter easier to understand.
#354
slide 354
  • Here is the transition we have gone through in Scala.
#355
slide 355
  • And yes, it works.
#356
slide 356
  • It works in Haskell…
#357
slide 357
  • …and it works in Scala.
#358
slide 358
  • To conclude this talk, I want to point out that one way in which the iterative algorithm differs from the recursive one, results in the iterative algorithm doing less work.
  • Because the recursive algorithm does not remember where it has already placed previous queens, it will happily generate, and then test for safety, a candidate board that is obviously unsafe, because it contains two queens in the same column.
  • The iterative algorithm, on the other hand, does remember where it has already placed previous queens.
  • The number of columns still available in the ‘emptyColumns’ argument of the ‘oneMoreQueens’ function, shrinks over time to reflect attempts to place queens in available columns.
#359
slide 359
  • Let's skip these slides.
#360
slide 360
#361
slide 361
#362
slide 362
#363
slide 363
#364
slide 364
#365
slide 365
#366
slide 366
#367
slide 367
#368
slide 368
#369
slide 369
#370
slide 370
#371
slide 371
#372
slide 372
#373
slide 373
#374
slide 374
#375
slide 375
#376
slide 376
#377
slide 377
#378
slide 378
#379
slide 379
#380
slide 380
  • Let’s concentrate on the right-hand side of this table.
  • It shows that when N=4, the recursive algorithm generates 60 boards, of which 16 are candidate boards, i.e. they contain the requisite number of queens (4 in this case).
  • The iterative algorithm, on the other hand, generates 32 boards (about half the number), of which 4 are candidate boards (a quarter).
  • So, the iterative algorithm does less work.
#381
slide 381
  • This difference persists and widens as N grows.
  • For N=8, for example, the recursive algorithm generates over 15,718 boards, whereas the iterative algorithm generates 5,506 (about a third).
  • As for candidate boards, the recursive algorithm generates 2,495 boards, whereas the iterative one generates 312 (almost a tenth).
#382
slide 382
#383
slide 383
  • Let’s skip this slide.
#384
slide 384
  • Here is how the number of boards generated by the two algorithm compares as N grows from 4 to 8.
#385
slide 385
  • And here is the same illustration but for candidate boards.
#386
slide 386
  • That’s all folks.