Accumulators

Accumulation of a result is a very useful way to view algorithms that traverse data structures. All recursive algorithms on recursive data types, e.g. lists and trees, can be viewed as accumulator algorithms. Accumulator algorithms, however, are NOT limited to recursive data structures!

The fundamental notion of an accumulator algorithm is that the data needed for a particular calculation is spread throughout a data structure, thus necessitating a traversal process that "accumulates" the answer as the algorithm "travels" from one data element to the next. Note that the notion of "accumulation" here is at its most general and does not necessarily mean an additive or multiplicative process--it could be one where each element is kept or discarded, as a filtering algorithm would do.

"Reverse Accumulation"

Let's look at an example of what the HtDP book calls a "natural" recursion algorithm:

;; sum a list of numbers
(define (sum lon)
   (cond
      [(empty? lon) 0]
      [(cons? lon) (+ (first lon) (sum (rest lon)))]))

Look at the role of the recursive result "(sum (rest lon))". The cons? clause can be described as the addition of first to the recursive result. We can say that the recursive result is the accumulated sum of the rest of the list. That is, the value returned by the recursive call to "sum" is the accumulated sum so far in the algorithm. All the cons? clause does is create a new value for the accumulated result by adding first to the recursive result.

In this sort of algorithm, the final answer is created by sweeping the data from the rear of the list (i.e. from the empty list towards the front of the list) and accumulating a sum as one sweeps through the data. The accumulated result is passed along via the return value of the function.

We will call this style of accumulation a Reverse Accumulation algorithm because the result is being accumulated as one "reverses" back out of the data structure.
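To make the direction of the sweep concrete, here is a hand evaluation of sum on a three-element list. The definition of sum is repeated only so that the block runs on its own; the "#lang racket" line is an assumption made for running it as a standalone file.

```racket
#lang racket
;; Drop the #lang line when pasting into an HtDP teaching language.

;; sum, repeated from above
(define (sum lon)
  (cond
    [(empty? lon) 0]
    [(cons? lon) (+ (first lon) (sum (rest lon)))]))

;; Hand evaluation of (sum (list 1 2 3)):
;;   (sum (list 1 2 3))
;; = (+ 1 (sum (list 2 3)))
;; = (+ 1 (+ 2 (sum (list 3))))
;; = (+ 1 (+ 2 (+ 3 (sum empty))))   ; the recursion bottoms out at the empty list
;; = (+ 1 (+ 2 (+ 3 0)))             ; the accumulator starts at 0
;; = (+ 1 (+ 2 3))                   ; accumulator is 3 after absorbing the last element
;; = (+ 1 5)                         ; accumulator is 5
;; = 6
(sum (list 1 2 3))   ; 6
```

No addition happens until the empty list has been reached; every "+" is performed on the way back out.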

Here's another example that uses a more generalized notion of "accumulation":

;; return the largest positive number in a list of numbers or zero if there are no positive values.
(define (get_largest_pos lon)
   (cond
       [(empty? lon) 0]
       [(cons? lon)
          (local
               [(define acc (get_largest_pos (rest lon)))]
               (if (> (first lon) acc) (first lon) acc))]))

Here we can see that the accumulated value is just the largest value from the rest of the list at any given point.
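The filtering case mentioned earlier fits the same shape. The following is only a sketch: keep_pos is a hypothetical name chosen for this illustration, and the "#lang racket" line is an assumption made for running it as a standalone file. Here the accumulated value is a list rather than a number.

```racket
#lang racket
;; Drop the #lang line when pasting into an HtDP teaching language.

;; keep_pos (hypothetical name): list-of-number -> list-of-number
;; keep only the positive numbers in a list, preserving their order
(define (keep_pos lon)
  (cond
    [(empty? lon) empty]
    [(cons? lon)
     (local
       [(define acc (keep_pos (rest lon)))]   ; positives found in the rest of the list
       (if (> (first lon) 0)
           (cons (first lon) acc)             ; keep first
           acc))]))                           ; discard first

(keep_pos (list 3 -1 0 7 -4))   ; a list containing 3 and 7
```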

We can extract a template for reverse accumulation algorithms, which is based on the template for a list-of-any:

(define (f_rev loa)
   (cond
       [(empty? loa) base]
       [(cons? loa) (... (first loa)...(f_rev (rest loa))...)]))

But we can take this a step further by noting that the "..."'s in the cons? clause just represent some function applied to first and the recursive result. In addition, the "base" is just some parameter that we could have passed in as an input parameter. Thus, we can rewrite the template as follows, including a name change, which I will justify afterwards:

;; foldr: (lambda any1 any2 -> any2) any2 list-of-any1 --> any2
(define (foldr func base loa)
   (cond
       [(empty? loa) base]
       [(cons? loa) (func (first loa) (foldr func base (rest loa)))]))

The template is gone! The code is now 100% concrete, with nothing left to fill in! We have defined a function, called "foldr" ("fold-right"), which is the same foldr defined in the last lab:

(foldr f base (list x1 x2 ... xn)) = (f x1 (f x2 ... (f xn base)))

Convince yourself that this definition is the generic process of reverse accumulation in a list.
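One way to do so is to expand the equation for a few concrete functions. The sketch below follows the foldr definition above but uses the hypothetical name my_foldr, so that it does not clash with the built-in foldr; the "#lang racket" line is an assumption made for running it as a standalone file.

```racket
#lang racket
;; Drop the #lang line when pasting into an HtDP teaching language.

;; my_foldr (hypothetical name): same definition as foldr above
(define (my_foldr func base loa)
  (cond
    [(empty? loa) base]
    [(cons? loa) (func (first loa) (my_foldr func base (rest loa)))]))

;; (my_foldr + 0 (list 1 2 3)) = (+ 1 (+ 2 (+ 3 0))) = 6
(my_foldr + 0 (list 1 2 3))

;; (my_foldr cons empty (list 1 2 3)) = (cons 1 (cons 2 (cons 3 empty)))
;; i.e. it rebuilds the original list
(my_foldr cons empty (list 1 2 3))

;; Subtraction is not commutative, so it exposes the right-to-left grouping:
;; (my_foldr - 0 (list 1 2 3)) = (- 1 (- 2 (- 3 0))) = (- 1 (- 2 3)) = (- 1 -1) = 2
(my_foldr - 0 (list 1 2 3))
```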

foldr is a higher order function that performs a reverse accumulation algorithmic process on a list.

We can see by this analysis that a reverse accumulation requires 2 parts, the two input parameters to foldr other than the list itself:

A function that takes first and the accumulator (the recursive result) and calculates the next accumulator value.
A base value that is the initial accumulator value (that the empty list returns).

The above examples can thus be written in terms of foldr, by simply supplying the appropriate accumulator generating function and the accumulator base value:

(define (sum lon)
    (foldr + 0 lon))

(define (get_largest_pos lon)
   (foldr (lambda (x rr)
              (if (> x rr) x rr))
          0
          lon))
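The filtering case mentioned at the start of this page can be expressed the same way. This is only a sketch: keep_pos_foldr is a hypothetical name chosen for this illustration, and the "#lang racket" line is an assumption made for running it as a standalone file. The accumulator generating function either conses x onto the recursive result or passes the recursive result through unchanged, and the base value is the empty list.

```racket
#lang racket
;; Drop the #lang line when pasting into an HtDP teaching language.

;; keep_pos_foldr (hypothetical name): list-of-number -> list-of-number
;; keep only the positive numbers in a list, preserving their order
(define (keep_pos_foldr lon)
  (foldr (lambda (x rr)
           (if (> x 0) (cons x rr) rr))   ; keep x or discard it
         empty
         lon))

(keep_pos_foldr (list 3 -1 0 7 -4))   ; a list containing 3 and 7
```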

"Forward Accumulation"

A general mantra that we have about moving data during an algorithm's execution is to say that we are "moving data from where it is, to where it can be processed".

In a reverse accumulation algorithm on a list, we recur all the way to the empty list before we start accumulating our result because that is the only place where the return value is unequivocally defined. The result is thus accumulated in the "reverse" direction as we slowly exit the recursion layer by layer.

If we can accumulate results by moving data from the rear of the list towards the front of the list, can we also accumulate a result by moving data the other direction, namely from the front of the list towards the rear? But of course!

Let's look at how we would sum a list of numbers using "forward accumulation":

(define (sum_fwd lon)
  (cond
    [(empty? lon) 0]
    [(cons? lon)
     (local
       [(define (helper acc aLon)
          (cond
            [(empty? aLon) acc]
            [(cons? aLon) (helper (+ (first aLon) acc) (rest aLon))]))]
       (helper (first lon) (rest lon)))]))

Notice, first of all, that the function requires a helper function. Why? Because the only way to pass data forward in the list is through an input parameter. Since forward accumulation is an implementation detail, the original function, sum_fwd, has (and should have) no input parameter for passing the accumulated result. Thus a helper function with an extra input parameter is needed to handle the accumulating value.

All the "outer" function, "sum_fwd", does is set up the initial value of the accumulator. It is not a recursive function. Only the helper is recursive.
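A hand evaluation shows the accumulator growing on the way in, rather than on the way back out. The definition of sum_fwd is repeated only so that the block runs on its own; the "#lang racket" line is an assumption made for running it as a standalone file.

```racket
#lang racket
;; Drop the #lang line when pasting into an HtDP teaching language.

;; sum_fwd, repeated from above
(define (sum_fwd lon)
  (cond
    [(empty? lon) 0]
    [(cons? lon)
     (local
       [(define (helper acc aLon)
          (cond
            [(empty? aLon) acc]
            [(cons? aLon) (helper (+ (first aLon) acc) (rest aLon))]))]
       (helper (first lon) (rest lon)))]))

;; Hand evaluation of (sum_fwd (list 1 2 3)):
;;   (sum_fwd (list 1 2 3))
;; = (helper 1 (list 2 3))   ; the outer function seeds acc with first
;; = (helper 3 (list 3))     ; acc = 2 + 1
;; = (helper 6 empty)        ; acc = 3 + 3
;; = 6                       ; empty list: the accumulator is the answer
(sum_fwd (list 1 2 3))   ; 6
```

By the time the helper reaches the empty list, the answer is already complete and is simply returned.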

We can write a template for forward accumulation, once again based on the template for a list-of-any:

(define (f_fwd loa)
    (cond
        [(empty? loa) base]
        [(cons? loa)
            (local
                 [(define (helper acc loa2)
                      (cond
                          [(empty? loa2) acc]
                          [(cons? loa2) (helper (... (first loa2)...acc...) (rest loa2))]))]
                 (helper (...(first loa)...base...) (rest loa)))]))

It is very tempting to say that the above is equivalent to:

(define (f_fwd loa)
    (cond
        [(empty? loa) base]
        [(cons? loa)
            (local
                 [(define (helper acc loa2)
                      (cond
                          [(empty? loa2) acc]
                          [(cons? loa2) (helper (... (first loa2)...acc...) (rest loa2))]))]
                 (helper base loa))]))

But to make that leap requires 2 conditions to be true:

The "..."'s in both the helper and the outer cons? clause must be identical.
"base" must represent a value, as opposed to something such as an exception.

For instance, finding the largest element in a list follows only the first template:

(define (get_largest lon)
  (cond
    [(empty? lon) (error "no largest in an empty list")]
    [(cons? lon)
     (local
       [(define (helper acc lon2)
          (cond
            [(empty? lon2) acc]
            [(cons? lon2) (helper (if (> (first lon2) acc) (first lon2) acc) (rest lon2))]))]
       (helper (first lon) (rest lon)))]))
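By contrast, summation meets both conditions: the combining step is "+" in both places, and the base, 0, is an ordinary value. A sketch of the sum written against the second template follows; sum_fwd2 is a hypothetical name chosen for this illustration, and the "#lang racket" line is an assumption made for running it as a standalone file.

```racket
#lang racket
;; Drop the #lang line when pasting into an HtDP teaching language.

;; sum_fwd2 (hypothetical name): sum_fwd rewritten to follow the second template
(define (sum_fwd2 lon)
  (cond
    [(empty? lon) 0]
    [(cons? lon)
     (local
       [(define (helper acc aLon)
          (cond
            [(empty? aLon) acc]
            [(cons? aLon) (helper (+ (first aLon) acc) (rest aLon))]))]
       (helper 0 lon))]))   ; seed with the base value and hand over the whole list

;; (sum_fwd2 (list 1 2 3))
;; = (helper 0 (list 1 2 3))
;; = (helper 1 (list 2 3))
;; = (helper 3 (list 3))
;; = (helper 6 empty)
;; = 6
(sum_fwd2 (list 1 2 3))   ; 6
(sum_fwd2 empty)          ; 0
```

No such seed exists for get_largest, because any number chosen as the base could exceed every element of the list. Seeding with 0, for example, would give 0 rather than -1 for (list -3 -1).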
