N choose k recursive python

The recursion is based on a simple observation. Rather than a mathematical proof through formulae, I will give a combinatorial argument for why it is true.

Whenever you choose k elements out of n, there are two cases:

  1. You choose element #n
  2. You don't choose element #n

Since these two cases are mutually exclusive, the total number of combinations is the sum of the number of combinations in which you choose #n and the number in which you don't.

Choosing element #n

Since we have already chosen one element, we only need to choose another k - 1 elements. And since we have already made a decision about one element (whether it is included or not), we only need to consider the remaining n - 1 elements.

Thus, the number of combinations in which element #n is chosen is given by

    subset(n - 1, k - 1)

Not choosing element #n

There are still k elements to choose, but since we have already made up our mind about element #n, there remain only n - 1 elements to choose from. Thus:

    subset(n - 1, k)
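
As a concrete example, take n = 3 and k = 2. The combinations that contain element #3 are {1, 3} and {2, 3}, which is subset(2, 1) = 2 of them, while the only combination without element #3 is {1, 2}, i.e. subset(2, 2) = 1. Together they account for all subset(3, 2) = 3 combinations.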

The base case

The recursion uses the fact that we can usually distinguish between two kinds of solutions: those in which element #n is part of the solution, and those in which it is not.

However, such a distinction cannot always be made:

  • When choosing all elements (corresponding to case n == k in code below)
  • or when choosing no elements at all (corresponding to case k == 0 in code below)

In these cases, there is exactly one solution, hence

if k == 0:
    return 1
if n == k:
    return 1
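
Putting the two recursive cases and the base cases together gives the complete function. Here is a minimal sketch; the name subset matches the calls above, while the argument order and the sample call at the end are my own choices:

def subset(n, k):
    # Base cases: choosing no elements, or choosing all of them
    if k == 0:
        return 1
    if n == k:
        return 1
    # Either element #n is in the combination (choose k-1 from the rest)
    # or it is not (choose k from the rest)
    return subset(n - 1, k - 1) + subset(n - 1, k)

print(subset(5, 2))   # 10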

Ensuring it works

To ensure that the recursion terminates, we need to convince ourselves (or prove) that a base case is always hit at some point.

Let us assume that n < k at some point. Since, per our assumption, n was originally greater than or equal to k, and at each recursive call either n and k decrease in unison or only n decreases by one, there must have been an earlier call at which n = k before n could drop below k.

For n to then drop below k, that call with n = k would have had to recurse into subset(n - 1, k). However, this is not possible, since n = k is a base case in which we return the constant 1 without recursing.

We conclude that n never drops below k: either n decreases until n = k, or n and k decrease in unison exactly k times until k = 0.

Thus, a base case is always reached and the recursion terminates.

Combinations

A great many problems in math and computer science have recursive solutions. In this section of the lecture we will study a problem from mathematics that has an elegant recursive solution.

The combination symbol C(n, k), or "n choose k", has many applications in mathematics. In probability theory, C(n, k) counts the number of ways that we can get k heads in a sequence of n flips of a fair coin.

C(n, k) is computed via the formula

    C(n, k) = n! / (k! (n - k)!)
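
As a quick sanity check of the formula, the direct factorial computation is only a couple of lines in Python; the name C_factorial is mine, and math.factorial with integer division is all it needs:

import math

def C_factorial(n, k):
    # Direct use of the formula n! / (k! (n - k)!)
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

print(C_factorial(16, 10))   # 8008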

Two special cases are easy to compute.

    C(n, 1) = n
    C(n, n) = 1

One way to compute a combination is to compute all the factorials involved and then divide the denominator into the numerator. This is not practical for even moderately large values of n because the factorials grow very quickly as a function of n. Instead, the most common method for computing a combination makes use of this recursive relationship:

    C(n, k) = C(n-1, k-1) + C(n-1, k)

This recursive relationship coupled with the two special cases makes it possible to construct a recursive function definition in Python to compute a combination.

def C(n,k):
  if k == 1:
    return n
  if k == n:
    return 1
  return C(n-1,k-1)+C(n-1,k)

This works, and does compute the correct value for a combination.

This approach does have one flaw. Consider what happens when we try to compute C(8,5).

C(8,5) calls C(7,4) and C(7,5)
C(7,4) calls C(6,3) and C(6,4)
C(7,5) calls C(6,4) and C(6,5)
C(6,3) calls C(5,2) and C(5,3)
C(6,4) calls C(5,3) and C(5,4)
C(6,4) calls C(5,3) and C(5,4)
C(6,5) calls C(5,4) and C(5,5)

What you start to notice after a while is that the same functions are getting called multiple times. This effect becomes so severe for large n and k that the simple recursive function shown above becomes immensely inefficient.

We can construct a program that illustrates just how bad this problem is. The trick is to declare a global two dimensional array

counts = np.zeros((18,18),dtype=int)

that stores information about how many times we call C with each pair of parameter values n and k.

We then modify the function that computes the combination to record how many times it gets called with each pair of parameters.

def C(n,k):
  counts[n][k] += 1

  if k == 1:
    return n
  if k == n:
    return 1
  return C(n-1,k-1)+C(n-1,k)

At the end of the program we dump this count data to a file.

def saveCounts():
  with open('counts.txt','w') as f:
    for row in counts:
      for col in row:
        f.write('{:5d}'.format(col))
      f.write('\n')
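
As an aside, numpy can write essentially the same table in one call (the columns are separated by spaces rather than packed into fixed-width fields):

np.savetxt('counts.txt', counts, fmt='%5d')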

The results are shocking. Here is the set of counts that results when we try to compute C(16,10):

    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0 1287 1287    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0  495 1287  792    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0  165  495  792  462    0    0    0    0    0    0    0    0    0    0    0    0    0
    0   45  165  330  462  252    0    0    0    0    0    0    0    0    0    0    0    0
    0    9   45  120  210  252  126    0    0    0    0    0    0    0    0    0    0    0
    0    1    9   36   84  126  126   56    0    0    0    0    0    0    0    0    0    0
    0    0    1    8   28   56   70   56   21    0    0    0    0    0    0    0    0    0
    0    0    0    1    7   21   35   35   21    6    0    0    0    0    0    0    0    0
    0    0    0    0    1    6   15   20   15    6    1    0    0    0    0    0    0    0
    0    0    0    0    0    1    5   10   10    5    1    0    0    0    0    0    0    0
    0    0    0    0    0    0    1    4    6    4    1    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    1    3    3    1    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    1    2    1    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    1    1    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0

What we can read off from this table is that C(2,1), C(2,2), and C(3,2) each got called a total of 1287 times in the process of computing C(16,10). This problem of redundant function calls gets exponentially worse as n and k get larger. By the time you get to n = 30 this simple recursive method for computing C(n,k) becomes completely impractical.

Remembering results

The simple fix needed to eliminate redundant calls to a recursive function is called memoization. The idea is to remember any values we compute, so that the next time we are asked to compute a value we can look it up instead of recomputing it. This works best for recursive functions with a moderate number of integer parameters. In that case, we can remember values we have computed earlier by storing the values in a table. Whenever we are asked to compute a value, we start by consulting the table.
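
As an aside, Python's standard library already packages this idea: functools.lru_cache wraps a function and remembers its results automatically. A minimal sketch of the combinations function using it (an alternative to, not part of, the table-based approach developed below):

from functools import lru_cache

@lru_cache(maxsize=None)
def C(n, k):
    if k == 1:
        return n
    if k == n:
        return 1
    return C(n - 1, k - 1) + C(n - 1, k)

print(C(16, 10))   # 8008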

In the case of the function to compute combinations, we proceed as follows. We start by making a global two dimensional array to hold the remembered function values.

c = np.zeros((100,100),dtype=int)

We then rewrite the recursive function to use the table. Whenever we are asked to compute C(n,k) for a particular n and k, we start by checking the table to see if it has already been computed. If it hasn't, we compute it and save the value in the table before returning the result.

def C(n,k):
  result = c[n][k]
  if result == 0:
    if k == 1:
      result = n
    elif k == n:
      result = 1
    else:
      result = C(n-1,k-1)+C(n-1,k)
    c[n][k] = result
  return result
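
Assuming the table c and the memoized C above have been defined, a quick check against known values shows that even n = 30, which defeats the naive version, now comes back instantly:

print(C(16,10))   # 8008
print(C(30,15))   # 155117520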

Replacing recursion with a table

An interesting side effect of the memoized solution is that it ends up filling up a table c[n][k] with values for C(n,k).

A more direct approach to doing the combinations problem efficiently is to just write code that fills the table with the right values.

We can do this with a function

def initTable():
  # Fill the table with correct values
  # for c[n][k] for all values of n and k

Once the c array is properly filled with values, we can rewrite the function to compute combinations

def C(n,k):
  if k <= n:
    return c[n][k]
  return 0

To write the initTable function we simply have to replicate the logic in the original recursive solution

def C(n,k):
  if k == 1:
    return n
  if k == n:
    return 1
  return C(n-1,k-1)+C(n-1,k)

in a set of loops. There are two things we need to do. The first is to write a loop that fills in the c[n][k] entries for all n and k in the base cases. The base cases for combinations are k = 1 and k = n (the code also fills in k = 0, where there is exactly one way to choose nothing):

def initTable():
  # Fill in the base cases
  for n in range(0,MAX_N):
    c[n][n] = 1
    c[n][1] = n
    c[n][0] = 1

Finally, we construct a loop that fills in c[n][k] for all of the remaining entries. We have to be careful when constructing the loop to make sure that the formula for c[n][k] only uses entries that the loop has already filled in. Here is the correct loop structure to do this.

import numpy as np

MAX_N = 100
MAX_K = 100
c = np.zeros((MAX_N, MAX_K), dtype=int)

def initTable():
  # Fill in the base cases
  for n in range(0,MAX_N):
    c[n][n] = 1
    c[n][1] = n
    c[n][0] = 1
  # Fill in the remaining entries
  for n in range(2,MAX_N):
    for k in range(2,n):
      c[n][k] = c[n-1][k-1]+c[n-1][k]
    for k in range(n+1,MAX_K):
      c[n][k] = 0
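
Once initTable has been called, every combination up to the table size is a simple lookup. For example, with the table-lookup version of C shown earlier:

initTable()
print(C(8,5))     # 56
print(C(16,10))   # 8008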

Romberg integration

In an earlier lecture we used the trapezoid rule to estimate the area under a curve. The basic idea is to divide the range of integration into a large number of subintervals. Over each of these subintervals we construct a trapezoid whose area approximates the area under the curve over that subinterval.

The area of one such trapezoid is h (f(xi) + f(xi+1)) / 2, where xi and xi+1 are the endpoints of the subinterval and h is its width. Adding all of these trapezoid areas up gives us the trapezoid area estimate for N subintervals:

    T(h) = h ( f(x0)/2 + f(x1) + f(x2) + ⋯ + f(xN-1) + f(xN)/2 )

The trapezoid rule forms the starting point for a more sophisticated method. The first step in this method is a technique called the Richardson extrapolation.

The Richardson extrapolation begins with a consideration of the error term in the trapezoid rule. When we deploy the trapezoid rule to estimate an integral with N subintervals and a step size of h, it typically produces a result that is accurate to within an error proportional to h^2. Writing I for the exact value of the integral and T(h) for the trapezoid estimate with step size h, we have

    I = T(h) + O(h^2)

The notation O(h^2) I used here means that the error is of order h^2. Another way to write this fact is to expand the error term as a combination of various powers of h:

    I = T(h) + c1 h^2 + c2 h^4 + c3 h^6 + ⋯

(This is somewhat beyond the scope of the discussion here, but it does turn out that the error term from the trapezoid rule contains only even powers of h.)

The next step is to deploy the Richardson extrapolation, which is a strange but effective algebraic trick. In this trick we write down the error relationship twice, once for N trapezoids as we saw above, and a second time for 2N trapezoids. Increasing the number of trapezoids by a factor of 2 reduces the step size from h to h/2, so we get these two expressions for the error:

    I = T(h) + c1 h^2 + c2 h^4 + ⋯

    I = T(h/2) + c1 h^2/4 + c2 h^4/16 + ⋯

Multiplying the second equation by 4 and subtracting the first equation from it produces

    3 I = 4 T(h/2) - T(h) - (3/4) c2 h^4 + ⋯

We can rewrite this as

    I = (4 T(h/2) - T(h))/3 + O(h^4)

or

    I = T(h/2) + (T(h/2) - T(h))/3 + O(h^4)

By using this simple algebraic trick, we have switched from a method that has an error term with magnitude O(h^2) to a method with an error term with magnitude O(h^4).
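
To see the effect numerically, here is a small self-contained check. It uses a different, smooth test integrand, sin(x) on [0, pi] with exact integral 2, purely to illustrate how much accuracy one extrapolation step buys; the function name trap and the sample step counts are my own:

import numpy as np

# Composite trapezoid estimate of the integral of sin(x) from 0 to pi
def trap(N):
    x = np.linspace(0.0, np.pi, N + 1)
    h = np.pi / N
    return h * (np.sin(x[0]) / 2 + np.sin(x[1:-1]).sum() + np.sin(x[-1]) / 2)

coarse = trap(16)                             # step size h
fine = trap(32)                               # step size h/2
extrapolated = fine + (fine - coarse) / 3     # Richardson extrapolation

print(abs(2.0 - coarse))         # error of order h^2
print(abs(2.0 - fine))           # error of order h^2, roughly 4 times smaller
print(abs(2.0 - extrapolated))   # error of order h^4, far smaller than either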

Better yet, this trick can be iterated. I will spare you the details, but iterating this trick is the basis for an integration technique called Romberg integration. Here is a summary of the Romberg technique. The function R(k,j) represents the jth iteration of this technique starting from a trapezoid rule with N = 2^(k-1) steps.

R(a,b,k,1) = trapezoidArea(a, b, 2^(k-1))

    R(a,b,k,j) = R(a,b,k,j-1) + ( R(a,b,k,j-1) - R(a,b,k-1,j-1) ) / (4^(j-1) - 1)    for j > 1

Here is a Python program that uses this recursive formula to compute integral estimates.

import math
import numpy as np

# This is the function we will work with for this example
def f(x):
    return np.sqrt(4.0 - x**2)   # np.sqrt works on both scalars and arrays

# Compute an approximation to the integral
# of f(x) from a to b by using trapezoids
# with steps of size (b-a)/N
def trapezoidArea(a, b, N):
    h = (b - a) / N
    x = np.arange(a, b, h)

    return h * (f(x).sum()) - h * f(a) / 2 + h * f(b) / 2


# Recursively defined Romberg formula
def romberg(a,b,k,j):
    if j == 1:
        return trapezoidArea(a,b,2**(k-1))
    elif k < j:
        return 0
    else:
        return romberg(a,b,k,j-1) + (romberg(a,b,k,j-1)-romberg(a,b,k-1,j-1))/(4**(j-1)-1)

print("  j  estimate  error");
for j in range(2,6):
    estimate = romberg(0.0,2.0,4*j,j)
    error = math.fabs(math.pi - estimate)
    print("{:>3d}  {:f}  {:.16f}".format(j,estimate,error))

It takes a little bit of effort to unpack what this program is doing. The loop at the bottom is computing R(a,b,4j,j) for j in the range from 2 to 5. For a given value of j, this method starts with a trapezoid rule estimate using 2^(k-1) = 2^(4j-1) steps. When j = 5 this means using 2^19 = 524288 steps.

This program has one last flaw in it, and this flaw seriously bogs down the computation starting at around j = 4. The problem here is one of excessive recursion. As you can see from the code above, a typical call to the romberg function triggers three additional recursive calls. If each of those in turn triggers more calls to the romberg function, you can end up making a huge number of calls to this function.

To counteract this problem, we can use a simple caching strategy. Each time we compute a value for romberg(a,b,k,j) for a particular combination of k and j, we should make a note of that result. The next time we end up needing that value we can simply look up the result without having to recompute anything.

The final version of the program makes use of this caching strategy. We set up a two dimensional list R that can store values for many combinations of k and j. The romberg function then hands most of the work off to a recursive internal function, r_int. r_int does an optimized calculation by first checking the cache to see if a result has already been recorded for this combination of k and j. If so, it immediately stops and returns the cached value. If not, it falls back to the recursive formula to compute a result, and then caches that result before returning an answer.

import math
import numpy as np

# This is the function we will work with for this example
def f(x):
    return np.sqrt(4.0 - x**2)   # np.sqrt works on both scalars and arrays

# Compute an approximation to the integral
# of f(x) from a to b by using trapezoids
# with steps of size (b-a)/N
def trapezoidArea(a, b, N):
    h = (b - a) / N
    x = np.arange(a, b, h)

    return h * (f(x).sum()) - h * f(a) / 2 + h * f(b) / 2

# Recursively defined Romberg formula with caching
# r_int will compute R[k,j] but then remember that
# result in the R list to speed up subsequent calculations.
def r_int(a,b,k,j,R):
    if R[k][j] != -1: # Have we seen this k,j before?
        return R[k][j] # If so, just return the cached value

    # If not, compute, cache, and return the result
    if j == 1:
        R[k][j] = trapezoidArea(a,b,2**(k-1))
    elif k < j:
        R[k][j] = 0
    else:
        R[k][j] = r_int(a,b,k,j-1,R) + (r_int(a,b,k,j-1,R)-r_int(a,b,k-1,j-1,R))/(4**(j-1)-1)
    return R[k][j]

def romberg(a,b,k,j):
    R = [[-1 for c in range(j+1)] for r in range(k+1)]
    return r_int(a,b,k,j,R)

print("       N  estimate  error");
for j in range(4, 27, 2):
    estimate = romberg(0.0, 2.0, j, j)
    error = np.fabs(np.pi - estimate)
    print("{:>8d}  {:f}  {:.16f}".format(2**(j-1), estimate, error))

Note that since we have a more efficient method now we can safely expand the range of j that we compute with. Here are the results that this program returns.

       N  estimate  error
       8  3.124218  0.0173744893594270
      32  3.139447  0.0021459042519050
     128  3.141325  0.0002678879206544
     512  3.141559  0.0000334785532052
    2048  3.141588  0.0000041846140313
    8192  3.141592  0.0000005230705571
   32768  3.141593  0.0000000653836278
  131072  3.141593  0.0000000081729490
  524288  3.141593  0.0000000010216188
 2097152  3.141593  0.0000000001277005
 8388608  3.141593  0.0000000000159650
33554432  3.141593  0.0000000000019940

Programming Assignment

In one of the first lectures in this course we saw an example of the polynomial interpolation problem. Given a list (x0 , y0) , (x1 , y1) , … , (xn , yn) of points to interpolate, we want to construct an nth degree polynomial that passes through the points.

Newton solved this problem by looking for a polynomial that takes a special form, called the Newton polynomial:

Pn(x) = a0 + a1(x - x0) + a2(x - x0)(x - x1) + ⋯ + an(x - x0)(x - x1)⋯(x - xn-1)

Newton was able to come up with a recursive relationship that the coefficients of this polynomial satisfied. He introduced a special notation

    f[i,i] = yi
    f[i,j] = (f[i+1,j] - f[i,j-1]) / (xj - xi)    for j > i

Using this notation the coefficients of the Newton polynomial are

ak = f[0,k]
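
With this definition, for example, a1 = f[0,1] = (y1 - y0)/(x1 - x0), which is just the slope of the line through the first two points, exactly the coefficient of (x - x0) you would expect for straight-line interpolation.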

Write a program that can read a list of data points from a text file and construct a Newton polynomial that interpolates the points. Each line in the text file contains one x value and one y value. Your program should read those data points and store them in two lists, xs and ys. Your program will then prompt the user to enter a value for x and then compute and print the value of the Newton polynomial at that x.

Your program should contain at least one function, f(i,j), to compute the factors f[i,j]. You may also want to write a second function, term(k,x) to compute the kth term of the interpolating polynomial:

f[0,k] (x- x0)(x - x1)⋯(x - xk-1)

To test your program, here is some test data.

 x      y
1.0 0.7651977
1.3 0.6200860
1.6 0.4554022
1.9 0.2818186
2.2 0.1103623

The polynomial of degree four that passes through these points has value approximately 0.5118200 at x = 1.5.