Enigmatic Code

Programming Enigma Puzzles

Running the first program

“Notes on Note G”

[ Part 1 | Part 2 | Part 3 ]

[Note: WordPress sometimes has problems including program code in posts, so I’ve made the code used in these articles available on GitHub – [link]]

[Note: The programs can also be run without the need to install anything @repl.it]

I recently watched the BBC4 documentary Calculating Ada: The Countess of Computing (Hannah Fry’s 10-part radio series Computing Britain is also an interesting listen)[¹]. And although I know that Ada Lovelace is credited with publishing the first computer program, I hadn’t ever seen the form that this program took before. It’s not hard to track down – on Ada’s Wikipedia page there is an image of the diagram ([image] – “Diagram for the computation by the Engine of the Numbers of Bernoulli”).

The diagram itself was part of a note (Note G) that was attached to a translation of a French transcript (by an Italian military engineer named Luigi Frederico Menabrea – who would later become the Italian Prime Minister) of a seminar given by Charles Babbage at the University of Turin in 1840. It was the first and only time Babbage gave a public lecture on the Analytical Engine. Ada was commissioned to translate the paper into English. After nine months of working she published her translation, along with a set of notes (which were more than twice as long as the original paper). This was the first (and, for a hundred years, the only) text on computer programming. It is in one of the notes that the diagram that is considered to be the worlds first published computer program appeared. (The full text of Ada’s paper is available here [link], along with a large version of the diagram [link]).

Here it is:

Diagram for the computation of Bernoullinumbers

“Diagram for the computation by the Engine of the Numbers of Bernoulli”

It’s not what we think of now as a computer program – after all it pre-dates computer programming languages. Instead it essentially the description of a sequence of operations performed by the Analytical Engine. So it is more like a trace of the program execution than the source code of the program itself, and it only includes the actual arithmetic operations that the engine performs. The control flow of the execution is dealt with in the attached text.

Of course the Analytical Engine was never completed, so the actual form of the instructions is not completely clear. However, I’m not trying to construct copy of the punched cards necessary to run Ada’s program. I’m attempting to transliterate her algorithm into a modern programming language in order to understand it, and the Analytical Engine, better.

Like Ada I have found that my own notes have become somewhat longer than I anticipated.

First I’ll explain what the program does. The purpose of the program is to compute Bernoulli Numbers. Ada’s program is concerned with calculating the odd-numbered Bernoulli Numbers[²], and is based on the following equation (given in Note G) for calculating B[2n − 1], given the already computed values of B[1], B[3], …, B[2n − 3]:

A[0] + A[1]B[1] + A[3]B[3] + … + B[2n − 1] = 0

where the values of the A[] coefficients are defined as follows:

A[0] = (−1/2) (2n − 1)/(2n + 1)
A[1] = 2n/2 = n
A[3] = 2n(2n − 1)(2n − 2) / (2.3.4)
A[5] = 2n(2n − 1)(2n − 2)(2n − 3)(2n − 4) / (2.3.4.5.6)

For k > 1, each A[k] is derived from A[k − 2] by multiplying the next two decreasing terms into the numerator, and the next two increasing terms into the denominator. (Note that for each value of n the sequence of A[] coefficients are different. We might more properly think of them as A[nk]).

Ada’s program works by starting at n=1, the equation for deriving B[1] is:

A[0] + B[1] = 0

and:

A[0] = (−1/2) (2n − 1)/(2n + 1) = −1/6

So:

B[1] = 1/6

We then consider n=2, using the previously computed value for B[1] we have:

A[0] + A[1]B[1] + B[3] = 0

and:

A[0] = (−1/2) (2n − 1)/(2n + 1) = −3/10
A[1] = 2

So:

B[3] = −1/30

We then consider n=3, and use the previously computed value for B[1] and B[3] to calculate B[5]. And the procedure can be repeated as many times as we like (or until we run out of space to store the results – the proposed design for the Analytical Engine would have had room for 1,000 variables, each containing 50 decimal digits). Each time we use the results from all the previous runs to calculate the next number in the sequence.

To explain the implementation of the algorithm here is my own implementation of it using modern Python idioms. This program is constructed using the same basic structure as Ada’s program.

from fractions import Fraction

# my implementation of the algorithm
def bernoulli():

  # results
  Bs = list()

  # start at n=1
  n = 1
  # calculate the sequence
  while True:

    # result = A0
    r = -Fraction(2 * n - 1, 2 * n + 1) / 2

    # A1 = n
    A = n
    # for each B[k] already determined calculate the corresponding A[k]
    for (i, B) in enumerate(Bs):
      if i > 0:
        # multiply in the 2 additional terms
        j = 2 * i - 1
        A *= Fraction(2 * n - j, 2 + j)
        j += 1
        A *= Fraction(2 * n - j, 2 + j)
      # add A[k] * B[k] into the result
      r += A * B

    # the computed bernoulli number is -r
    B = -r
    # return the number
    yield B
    # add it to the result list
    Bs.append(B)
    # increase n
    n += 1

# run N iterations of the algorithm
N = 10

# allow N to be specified as a command line argument
import sys
if len(sys.argv) > 1:
  N = int(sys.argv[1])

k = 1 # ada's numbering (normal numbering is k + 1)
K = 2 * N + 1 # max k
for r in bernoulli():
  print("B[{k}] = {r}".format(k=k, r=r))
  k += 2
  if k == K: break

[Program 1 - ada1.py]

Before we examine a more direct implementation of Ada’s diagram, first some notes on my own implementation:

  1. (Line 1) I’m using Python’s Fraction library to generate the Bernoulli sequence. This makes it easy to check the computation against existing lists (which give them in rational form), and provides exact answers. The Analytical Engine would have computed the numbers as fixed point decimal fractions. (So small errors would accumulate in the final decimal places, and as these results are fed back into further computations the errors would compound. It would be best to use more decimal places than required in the results so that suitably accurate truncated results can be generated).
  2. (Line 4) The bernoulli() function will generate successive Bernoulli numbers. I’ve implemented it as Python co-routine, so results are returned with the yield operator as they are determined. It’s up to the calling code to consume as many results as it requires. The diagram given by Ada describes the generation of a single Bernoulli number. And while her program itself sets up some of the data required for the next run, it would also require some additional modification of the cards before it is run again to compute the next number in the sequence.
  3. (Line 18) I’ve simplified the calculation of A1 from (2n/2) to just (n).
  4. (Line 41) We can specify how many numbers in the sequence we want the program to compute (the default is 10). They are printed out according to Ada’s numbering scheme.

The program is certainly non-trivial. It involves two loops (one nested in the other) and conditional execution, as well as the storing of the computed results to use in subsequent calculations.

The diagram published by Ada shows the computation of B[7], the fourth number in the sequence. The table consists of the following columns:

1. Line number.
2. Operation.
3. Operands.
4. Assignment of results.
5-6. Comments.
7-9. Parameters.
10-19. Variables.
20-23. Results.

Only columns 2, 3 and 4 are instructions to the computer, the rest are essentially supporting documentation to explain the program to the reader.

There are a few apparent errors (the first programming bugs) in the originally published diagram:

  1. At line 4 the operation is given as v5 ÷ v4, although the comments and text make it clear that this should be v4 ÷ v5.
  2. At line 23 the formula for A3 has two 3’s in the denominator, instead of a 3 and a 4 (as in the previous line). This is just an error in the comment and would not affect execution of the program.
  3. At line 25 it looks like v6 and v7 are reset to their initial values (of zero). In fact it’s v7 and v13 that need to be reset for the algorithm to work correctly. In the accompanying text Ada refers to resetting v6, v7, and v13 to zero (although it is not necessary to zero v6, as the first thing the program does (operation 1) is to assign 2n to v4, v5, and v6).

These are all “typos” rather than “thinkos”, and may not even have been present in Ada’s original notes.

Here is my Python program with the bernoulli() function replaced with a modified version that reflects a more direct transliteration of Ada’s description:

from collections import defaultdict
from fractions import Fraction

# ada lovelace's algorithm in Python
def bernoulli():

  # initially all varibles in the store are at 0
  v = defaultdict(int)

  # program start

  # initialise the variables
  v[1] = Fraction(1) # constant
  v[2] = Fraction(2) # constant
  v[3] = Fraction(1) # n = 1

  # outer loop (compute B[2n - 1])
  while True:

    # pseudo-block to permit "break"
    while True:

      # 0: set index register
      i = 1

      # 1: v4 = v5 = v6 = 2n
      v[4] = v[5] = v[6] = v[2] * v[3]

      # 2: v4 = 2n - 1
      v[4] = v[4] - v[1]

      # 3: v5 = 2n + 1
      v[5] = v[5] + v[1]

      # 4: v11 = (2n - 1)/(2n + 1) (the diagram seems to say v[5] / v[4]) [FIX]
      v[11] = v[4] / v[5]

      # 5: v11 = (1/2) (2n - 1)/(2n + 1)
      v[11] = v[11] / v[2]

      # 6: v13 = -(1/2) (2n - 1)/(2n + 1) = A0
      v[13] = v[13] - v[11]

      # 7: v10 = n - 1
      v[10] = v[3] - v[1]
      # branch if zero to operation 24
      if v[10] == 0: break

      # 8: v7 = 2
      v[7] = v[2] + v[7]

      # 9: v11 = (2n)/2 = A1 [why not just set v[11] = v[3] instead of 8 & 9 ?]
      v[11] = v[6] / v[7]

      # 10: v12 = A1 * B1
      v[12] = v[20 + i] * v[11]
      i += 1

      # 11: v13 = A0 + A1 * B1
      v[13] = v[12] + v[13]

      # 12: v10 = n - 2
      v[10] = v[10] - v[1]
      # branch if zero to operation 24
      if v[10] == 0: break

      # for each computed result, B = B3 [1], B5 [2], ...
      while True:

        # 13: v6 = 2n - 1 [1], 2n - 3 [2], ...
        v[6] = v[6] - v[1]

        # 14: v7 = 3 [1], 5 [2], ...
        v[7] = v[1] + v[7]

        # 15: v8 = (2n - 1)/3 [1], (2n - 3)/5 [2], ...
        v[8] = v[6] / v[7]

        # 16: v11 = (2n)/2 * (2n - 1)/3 [1], (2n)/2 * (2n - 1)/3 * (2n - 3)/5 [2], ...
        v[11] = v[8] * v[11]

        # 17: v6 = 2n - 2 [1], 2n - 4 [2], ...
        v[6] = v[6] - v[1]

        # 18: v7 = 4 [1], 6 [2], ...
        v[7] = v[1] + v[7]

        # 19: v9 = (2n - 2)/4 [1], (2n - 4)/6
        v[9] = v[6] / v[7]

        # 20: v11 = (2n)/2 * (2n - 1)/3 * (2n - 2)/4 = A3 [1], (2n)/2 * (2n - 1)/3 * (2n - 2)/4 * (2n - 3)/5 * (2n - 4)/6 = A5 [2], ...
        v[11] = v[9] * v[11]

        # 21: v12 = A3 * B3 [1], A5 * B5 [2], ...
        v[12] = v[20 + i] * v[11]
        i += 1

        # 22: v13 = A0 + A1 * B1 + A3 * B3 [1], A0 + A1 * B1 + A3 * B3 + A5 * B5 [2], ...
        v[13] = v[12] + v[13]

        # 23: v10 = n - 3 [1], n - 4 [2], ...
        v[10] = v[10] - v[1]
        # branch if non-zero to operation 13
        if v[10] == 0: break

      # terminate the pseudo-block
      break

    # 24: result (-v13) is copied into the results
    r = v[20 + i] = v[20 + i] - v[13]

    # the result could be printed here
    # we pass it back to the calling routine
    yield r

    # 25: increase n, and reset working variables
    v[3] = v[1] + v[3]
    v[7] = v[13] = 0

    # branch to operation 0

# run N iterations of the algorithm
N = 10

# allow N to be specified as a command line argument
import sys
if len(sys.argv) > 1:
  N = int(sys.argv[1])

k = 1 # ada's numbering (normal numbering is k + 1)
K = 2 * N + 1 # max k
for r in bernoulli():
  print("B[{k}] = {r}".format(k=k, r=r))
  k += 2
  if k == K: break

[Program 2 - ada2.py]

Ada’s diagram shows how the program would run when n=4, and it not a direct indication of the punched cards that would be fed to Analytical Engine. Of course as the Analytical Engine was never completed it’s actual instruction set is not something we can know for certain. But the way it is described as working is as follows:

  1. An operation card indicates the operation that the engine is to perform. (One of +, −, ×, ÷).
  2. A variable card indicates which variable is transferred from the store (memory) to the first Ingress Axis of the mill (CPU). This provides the first operand.
  3. A second variable card indicates which variable is transferred from the store to the second Ingress Axis if the mill. This provides the second operand.
  4. The operation is then performed, the result appears in the Egress Axis of the mill.
  5. A variable card indicates which variable in the store the contents of the Egress Axis are transferred to.

At stage 5 there may be multiple variable cards instructing that the result is stored in several variables, as we see in the first operation of Ada’s table. If we were to write an assembly language version of this operation it would look something like this:

# operation 1
MUL
LOAD v[2]
LOAD v[3]
STORE v[4]
STORE v[5]
STORE v[6]

And each of these opcodes corresponds to a card that acts as instructions to the Analytical Engine.

Ada’s accompanying text describes the need for some common programming idioms that we take for granted in modern programming languages, but are not present in the diagram (which just shows the arithmetic operations performed).

First there is the need for conditional execution. She mentions that when computing B[1] operation 6 would complete the calculation. In my program at line 47 I exit the block when v10 becomes 0. Similarly, as Ada says, operation 12 works in a similar way. Again in my program I conditionally exit the block if v10 becomes 0.

Secondly there is the need for repeated execution of a sequence of instructions (looping). This is dealt with extensively in the notes where the repeated execution of operations 13-23 is described.

Both these could be achieved with a conditional branch instruction. Although the mechanism for branching is not explicitly mentioned in the paper it is clear that the Analytical Engine implements such a function. For example, if the Analytical Engine set a flag when a result was 0 (as microprocessors often do), then there could be an instruction to skip forward or backwards a certain number of program steps and resume operation from there. The instructions were to be provided on a punched cards, so this would require the card reader to skip forward or rewind a certain number of cards.

(In the Fourmilab simulation of the Analytical Engine a “run-up lever” is used. This is set by an operation if the sign of the result of the operation being different from the sign of the first operand (or if an overflow occurs). A conditional branch can then be made based on the state of the lever. So, we could test if a variable was zero by subtracting 1 from it. The only way the sign will change is if the variable initially held the value of 0 (so the result is −1), and the lever will be set. As Ada doesn’t show these operations in her table I’m going to simplify by referring to a “branch if zero” instruction. In a similar way branching on other conditions can be implemented).

This would allow us to place an instruction after operation 7 and operation 12 of “branch if zero to operation 24”. The inner loop would be implemented by placing an instruction after operation 23  of “branch if non-zero to operation 13”. The outer loop would be implemented by placing an instruction after operation 25 of “branch always to operation 1” (or more likely to branch if n was less than a predefined constant value, so that a Bernoulli sequence of the specified length is generated).

The text also hints at the notion of indirection, where one variable (or register) can tell us which of a number of other variables we need to access. This is necessary as each time we iterate through the operations 13-23, we have computed the A[] coefficient after executing operation 20, but then in operation 21 we multiply this the corresponding number already computed in the B[] sequence. The first time round we would need B[3], stored in variable v22 (as specified in the operands for operation 21 in the diagram), but the next time round the loop we would need the B[5] value stored in v23, and so forth. Likewise when the final result (B[7]) is calculated it is stored by operation 24 into v24 and n is increased. The next time we get to operation 24 the final result will be B[9] and it will be stored into v25. Although the diagram itself does not indicate how the instructions of operations 21 and 24 are modified as the program is executed it is clear that it is an issue that Ada has considered.

From the way Ada describes the program it seems that way this would happen is that for each value of n the program is run once. Then it must be set up slightly differently for next run. Ada notes that exactly the same sequence of operation cards can be reused, but the variable cards must be modified slightly on each iteration to achieve the required results.

One way the Analytical Engine could do indirection would be to allow the index of a variable to be specified in another variable, so instead of a “LOAD v[12]” (load the Ingress Axis from variable 12 in the store), we would have something like “LOADI v[12]” (load the Ingress Axis from the variable in the store whose index is stored in v12). So if v[12] = 21, “LOADI v[12]” would be the same as “LOAD v[21]”.

But variables in the store hold 50 decimal digits and indices into the store are only 3 decimal digits, so it is a bit odd. Instead we might consider that the Analytical Engine could be equipped with a few special purpose three decimal digit index registers, these could then be used to provide an additional offset in a load instruction.

For the purposes of Ada’s program it would be necessary to only have a single additional index register. We could imagine a that “LOADI v[20]” would correspond to “LOAD v[20 + i]” where i represents the value in the index register. So if i = 1 the instruction “LOADI v[20]” would be the same as “LOAD v[21]”. And there would be a corresponding “STOREI” instruction. To increase the offset we would need an “INC” instruction, which would increment the number in the index register by 1. This would be sufficient to implement Ada’s program.

But perhaps this approach is too influenced by modern computing. A different approach would be to allow the LOAD and STORE instructions to get a variable from an alternate stack of cards. “LOAD alt” and “STORE alt”. That way the alternate stack of cards could be filled with the addresses of v21 to (say) v30. And in operation 10 the address that the first LOAD card uses isn’t punched on the card itself, instead it is read from the alternate stack (it’s the first card in the stack so it would be v21), and the stack moves to the next card. When we reach operation 21 we would again use a “LOAD alt” instruction to get the index of the variable from the alternate stack. The first time round the loop it will be v22, then v23 and so on. When the loop ends, and we have processed all the previously calculated results the next card in the alternate stack will be the correct address to store the newly computed number. So operation 24 uses “STORE alt” and the number is stored.

Finally at the end of the diagram as well as looping back to the beginning we also rewind the alternate card deck and we are ready to go again without needing to alter any of the cards. After 10 iterations the first 10 Bernoulli numbers will have been stored in the variables referenced by the alternate stack, and the program can halt.

To produce the transliteration of Ada’s program that can run without intervention I have adopted the addition of an index register to allow indirection, although, as far as I know, this is not historically accurate.

Finally Ada notes that the process described in her diagram produces results of the correct magnitude, but not the correct sign. If there is no simple way to change the sign of a value, this can easily be remedied by subtracting the result (in v[13]) from the variable that the result is to be stored in (as it will initially be 0), and then storing it back into the result variable. Consequently I’ve adjusted operation 24 from “v[24] = v[13] + v[24]” to be “v[24] = v[24] − v[13]”, the sign is then correctly set for value to be reused by the algorithm.

Writing an equivalent pseudo-assembly language version of my transliteration would look something like this:

# initialise the variables
SET v[1] 1 # constant 1
SET v[2] 2 # constant 2
SET v[3] 1 # n = 1

# operation 0: initialise the index register
SET i 1

# operation 1
MUL
LOAD v[2]
LOAD v[3]
STORE v[4]
STORE v[5]
STORE v[6]

# operation 2
SUB
LOAD v[4]
LOAD v[1]
STORE v[4]

# operation 3
ADD
LOAD v[5]
LOAD v[1]
STORE v[5]

# operation 4
DIV
LOAD v[4]
LOAD v[5]
STORE v[11]

# operation 5
DIV
LOAD v[11]
LOAD v[2]
STORE v[11]

# operation 6
SUB
LOAD v[13]
LOAD v[11]
STORE v[13]

# operation 7
SUB
LOAD v[3]
LOAD v[1]
STORE v[10]

# branch if zero
BRANCH ZERO (operation 24)

# operation 8
ADD
LOAD v[2]
LOAD v[7]
STORE v[7]

# operation 9
DIV
LOAD v[6]
LOAD v[7]
STORE v[11]

# operation 10
MUL
LOAD v[20 + i] # indirection
LOAD v[11]
STORE v[12]
INC i # index register is incremented

# operation 11
ADD
LOAD v[12]
LOAD v[13]
STORE v[13]

# operation 12
SUB
LOAD v[10]
LOAD v[1]
STORE v[10]

# branch if zero
BRANCH ZERO (operation 24)

# operation 13
SUB
LOAD v[6]
LOAD v[1]
STORE v[6]

# operation 14
ADD
LOAD v[1]
LOAD v[7]
STORE v[7]

# operation 15
DIV
LOAD v[6]
LOAD v[7]
STORE v[8]

# operation 16
MUL
LOAD v[8]
LOAD v[11]
STORE v[11]

# operation 17
SUB
LOAD v[6]
LOAD v[1]
STORE v[6]

# operation 18
ADD
LOAD v[1]
LOAD v[7]
STORE v[7]

# operation 19
DIV
LOAD v[6]
LOAD v[7]
STORE v[9]

# operation 20
MUL
LOAD v[9]
LOAD v[11]
STORE v[11]

# operation 21
MUL
LOAD v[20 + i] # indirection
LOAD v[11]
STORE v[12]
INC i # increment index register

# operation 22
ADD
LOAD v[12]
LOAD v[13]
STORE v[13]

# operation 23
SUB
LOAD v[10]
LOAD v[1]
STORE v[10]

# branch if non-zero
BRANCH NON-ZERO (operation 13)

# operation 24
SUB
LOAD v[20 + i] # indirection
LOAD v[13]
STORE v[20 + i] # indirection

# the result can be output at this point
PRINT

# operation 25
ADD
LOAD v[1]
LOAD v[3]
STORE v[3]

# reset working variables
SET v[7] 0
SET v[13] 0

# repeat
# (or if v[3] < N)
BRANCH ALWAYS (operation 0)

HALT

Each program line would correspond to a punched card that would be fed to the Analytical Engine, so this is the closest we can get to what the original stack of punched cards, and hence the first program, would have looked like.

Note: Whilst writing up this article I have found a lot more information out about the Analytical Engine than when I started, and there is much more for me to go through. I intend to investigate the Analytical Engine further, and I will correct any glaring errors written above as I become more familiar with the subject. Please let me know of any problems you find and I will endeavour to sort them out.

— J.M.R.

Here are some useful resources that I have found about Ada, Babbage and the Analytical Engine:


[¹] Apologies if you can’t access these programmes. BBC iPlayer is only available in the UK, and programmes are only available for a limited time.

[²] Ada’s indices for the Bernoulli Numbers are one less than the indices now generally used.

5 responses to “Running the first program

  1. Jim Randell 24 September 2015 at 5:23 pm

    It occurred to me that it is possible to write Ada’s program without the need for full indirection. The input data for a single run of the algorithm are the Bernoulli Numbers already calculated. These data are accessed in order, sequentially and each item only once. A stack of cards is sequentially accessed memory. So if an extra stack of “input data” cards were provided composed of the already calculated Bernoulli numbers, then they would provide the correct values to the program in the right order as it ran. When they were all processed the Engine would print out the next Bernoulli number on a card, which would be added to the input data stack for the next run. All that would be required would be a “LOAD input” instruction to load the register in the mill from the current input data card. After the card was read the next input data card would be available for the next “LOAD input” instruction.

    Here’s the Python transliteration:

    from collections import defaultdict
    from fractions import Fraction
    
    # Ada Lovelace's algorithm in Python
    def bernoulli():
    
      # input data is initially empty
      input_data = list()
    
      # initially all varibles in the store are at 0
      v = defaultdict(int)
    
      # program start
    
      # initialise the variables
      v[1] = Fraction(1) # constant
      v[2] = Fraction(2) # constant
      v[3] = Fraction(1) # n
    
      # outer loop (compute B[2n - 1])
      while True:
    
        # pseudo-block to permit "break"
        while True:
    
          # 0: prepare the input data stack
          data = iter(input_data)
    
          # 1: v4 = v5 = v6 = 2n
          v[4] = v[5] = v[6] = v[2] * v[3]
    
          # 2: v4 = 2n - 1
          v[4] = v[4] - v[1]
    
          # 3: v5 = 2n + 1
          v[5] = v[5] + v[1]
    
          # 4: v11 = (2n - 1)/(2n + 1) (the diagram seems to say v[5] / v[4])
          v[11] = v[4] / v[5]
    
          # 5: v11 = (1/2) (2n - 1)/(2n + 1)
          v[11] = v[11] / v[2]
    
          # 6: v13 = -(1/2) (2n - 1)/(2n + 1) = A0
          v[13] = v[13] - v[11]
    
          # 7: v10 = n - 1
          v[10] = v[3] - v[1]
          # branch if zero to operation 24
          if v[10] == 0: break
    
          # 8: v7 = 2
          v[7] = v[2] + v[7]
    
          # 9: v11 = (2n)/2 = A1 [why not just set v[11] = v[3] instead of 8 & 9]
          v[11] = v[6] / v[7]
    
          # 10: v12 = A1 * B1
          v[12] = next(data) * v[11] # use a value from the input data
    
          # 11: v13 = A0 + A1 * B1
          v[13] = v[12] + v[13]
    
          # 12: v10 = n - 2
          v[10] = v[10] - v[1]
          # branch if zero to operation 24
          if v[10] == 0: break
    
          # for each computed result in the input data = B3 [1], B5 [2], ...
          while True:
    
            # 13: v6 = 2n - 1 [1], 2n - 3 [2], ...
            v[6] = v[6] - v[1]
    
            # 14: v7 = 3 [1], 5 [2], ...
            v[7] = v[1] + v[7]
    
            # 15: v8 = (2n - 1)/3 [1], (2n - 3)/5 [2], ...
            v[8] = v[6] / v[7]
    
            # 16: v11 = (2n)/2 * (2n - 1)/3 [1], (2n)/2 * (2n - 1)/3 * (2n - 3)/5 [2], ...
            v[11] = v[8] * v[11]
    
            # 17: v6 = 2n - 2 [1], 2n - 4 [2], ...
            v[6] = v[6] - v[1]
    
            # 18: v7 = 4 [1], 6 [2], ...
            v[7] = v[1] + v[7]
    
            # 19: v9 = (2n - 2)/4 [1], (2n - 4)/6
            v[9] = v[6] / v[7]
    
            # 20: v11 = (2n)/2 * (2n - 1)/3 * (2n - 2)/4 = A3 [1], (2n)/2 * (2n - 1)/3 * (2n - 2)/4 * (2n - 3)/5 * (2n - 4)/6 = A5 [2], ... 
            v[11] = v[9] * v[11]
    
            # 21: v12 = A3 * B3 [1], A5 * B5 [2], ...
            v[12] = next(data) * v[11] # use a value from the input data
    
            # 22: v13 = A0 + A1 * B1 + A3 * B3 [1], A0 + A1 * B1 + A3 * B3 + A5 * B5 [2], ...
            v[13] = v[12] + v[13]
    
            # 23: v10 = n - 3 [1], n - 4 [2], ...
            v[10] = v[10] - v[1]
            # branch if non-zero to operation 13
            if v[10] == 0: break
    
          break
    
        # 24: result (-v13)
        r = - v[13]
    
        # the result would be printed to a card here
        # we add it to the input data for the next iteration
        input_data.append(r)
    
        # we also pass it back to the calling routine
        yield r
    
        # 25: increase n, and reset working variables
        v[3] = v[1] + v[3]
        v[7] = v[13] = 0
    
        # branch to operation 0
    
    
    # run N iterations of the algorithm
    N = 10
    
    # allow N to be specified as a command line argument
    import sys
    if len(sys.argv) > 1:
      N = int(sys.argv[1])
    
    k = 1 # Ada's numbering (normal numbering is k + 1)
    K = 2 * N + 1 # max k
    for r in bernoulli():
      print("B[{k}] = {r}".format(k=k, r=r))
      k += 2
      if k == K: break
    

    And the corresponding assembly language for the Analytical Engine:

    # initialise the variables (only done on the first run)
    [init]
    SET v[0] 0 # constant 0
    SET v[1] 1 # constant 1
    SET v[2] 2 # constant 2
    SET v[3] 1 # n = 1
    
    # program start
    [start]
    
    # operation 1
    MUL
      LOAD v[2]
      LOAD v[3]
      STORE v[4]
      STORE v[5]
      STORE v[6]
    
    # operation 2
    SUB
      LOAD v[4]
      LOAD v[1]
      STORE v[4]
    
    # operation 3
    ADD
      LOAD v[5]
      LOAD v[1]
      STORE v[5]
    
    # operation 4
    DIV
      LOAD v[4]
      LOAD v[5]
      STORE v[11]
    
    # operation 5
    DIV
      LOAD v[11]
      LOAD v[2]
      STORE v[11]
    
    # operation 6
    SUB
      LOAD v[13]
      LOAD v[11]
      STORE v[13]
    
    # operation 7
    SUB
      LOAD v[3]
      LOAD v[1]
      STORE v[10]
    
    # branch if zero
    BRANCH ZERO (operation 24)
    
    # operation 8
    ADD
      LOAD v[2]
      LOAD v[7]
      STORE v[7]
    
    # operation 9
    DIV
      LOAD v[6]
      LOAD v[7]
      STORE v[11]
    
    # operation 10
    MUL
      LOAD input
      LOAD v[11]
      STORE v[12]
    
    # operation 11
    ADD
      LOAD v[12]
      LOAD v[13]
      STORE v[13]
    
    # operation 12
    SUB
      LOAD v[10]
      LOAD v[1]
      STORE v[10]
    
    # branch if zero
    BRANCH ZERO (operation 24)
    
    # operation 13
    SUB
      LOAD v[6]
      LOAD v[1]
      STORE v[6]
    
    # operation 14
    ADD
      LOAD v[1]
      LOAD v[7]
      STORE v[7]
    
    # operation 15
    DIV
      LOAD v[6]
      LOAD v[7]
      STORE v[8]
    
    # operation 16
    MUL
      LOAD v[8]
      LOAD v[11]
      STORE v[11]
    
    # operation 17
    SUB
      LOAD v[6]
      LOAD v[1]
      STORE v[6]
    
    # operation 18
    ADD
      LOAD v[1]
      LOAD v[7]
      STORE v[7]
    
    # operation 19
    DIV
      LOAD v[6]
      LOAD v[7]
      STORE v[9]
    
    # operation 20
    MUL
      LOAD v[9]
      LOAD v[11]
      STORE v[11]
    
    # operation 21
    MUL
      LOAD input
      LOAD v[11]
      STORE v[12]
    
    # operation 22
    ADD
      LOAD v[12]
      LOAD v[13]
      STORE v[13]
    
    # operation 23
    SUB
      LOAD v[10]
      LOAD v[1]
      STORE v[10]
    
    # branch if non-zero
    BRANCH NON-ZERO (operation 13)
    
    # operation 24
    SUB
      LOAD v[0]
      LOAD v[13]
    
    # the result is printed to a card at this point
    PRINT
    
    # operation 25
    ADD
      LOAD v[1]
      LOAD v[3]
      STORE v[3]
    
    # reset working variables
    SET v[7] 0
    SET v[13] 0
    
    # the program can be restarted when the input stack has
    # been reloaded and the output card added in to it
    HALT
    

    We use “LOAD input” to access cards from the input data stream in operation 10 and operation 21. And we no longer store the result in the store in operation 24.

    On each run the Engine will print a card with next Bernoulli number in the sequence on it. The control and input cards are reset and the newly printed output card is added to the end of the input stack. The Engine is then ready to run the next iteration of the algorithm with exactly the same control cards as before.

    Theoretically this method would allow you to calculate as many Bernoulli numbers as you had cards to print them out onto, regardless of the number of variables in the store (as long as there are at least 14). But if the 1,000 variable Analytical Engine had ever been built you would probably run out of accuracy, or time, before you exhausted the storage available.

  2. Hugh Casement 28 September 2015 at 6:33 pm

    Clever lass, our Ada.  What a pity Babbage’s plan was so overambitious and he didn’t start small and work up, like Konrad Zuse.  Few people have heard of him, but he was ahead of the Americans with their ENIAC.  His first model was constructed in his bedroom at his parents’ home, painstakingly cutting and filing pieces of metal to form latches.  He saw from the start that binary is the way to go.  It’s perhaps an interesting reflection on different national characteristics that during the war the Germans were using his computer (a later model with electromechanical relays) to help design aircraft wing sections; it didn’t occur to them to try cracking ciphers that way!

    What were they intending to do with all those Bernoulli numbers?  My 50-year-old copy of Mathematical Methods for Science Students gives an utterly unpractical formula but makes no mention of what use they might be.  I believe the terms of series expansions for tan and cot can be expressed using Bernoulli numbers.  Do they have other applications?

  3. Jim Randell 20 August 2018 at 11:35 am

    This article at twobithistory.org goes into more detail on the origins of Ada’s program, and the uses of Bernoulli Numbers, and includes a transliteration of Ada’s program into C. (As well as being kind enough to reference my own musings on the subject).

  4. Simon Wright 24 August 2018 at 4:57 pm

    Hi!

    With reference to the need to reset V13 to 0, I think this is already covered. At step 24, the ‘indication of change’ column includes ‘4V13 = 0V13′, which means that V13, which previously held its 4th assigned value, now holds its initial value (0). This is achieved because transferring a value from the store to the mill was destructive (the store digits would be rolled back to zero as the ingress axis was rolled forward to the required value). There was an option to copy the digits to a temporary column, and then copy back to the store, thus resetting the store to its original value.

    There is a slightly handwavey discussion in the Sketch, which promises considerably more complication in where the values get copied back to!

    “These coefficients will be inscribed on the columns V0 and V4. If we commence the series of operations by the product of m into n’, these numbers will be effaced from the columns V0 and V4, that they may be transferred to the mill, which will multiply them into each other, and will then command the machine to represent the result, say on the column V6. But as these numbers are each to be used again in another operation, they must again be inscribed somewhere; therefore, while the mill is working out their product, the machine will inscribe them anew on any two columns that may be indicated to it through the cards; and as, in the actual case, there is no reason why they should not resume their former places, we will suppose them again inscribed on V0 and V4, whence in short they would not finally disappear, to be reproduced no more, until they should have gone through all the combinations in which they might have to be used.”

    I had fun translating the Fourmilab Java to Ada, at https://github.com/simonjwright/analytical-engine. To transfer V13 to the ingress axis, the code (variable card) would be Z013 to leave V13 zero, L013 to leave its value unchanged.

    • Jim Randell 25 August 2018 at 3:08 pm

      Hi Simon. Thanks for your comment.

      I did make some simplifications in my interpretation of the workings of the Analytical Engine in order to keep the article to a reasonable length, and one of those simplifications was to ignore the “destructive read” mechanism in favour of variables in the store that retain their values when they are read. I had read that there was a plan for the AE where “variable cards” would have an option that could be selected to have the value copied both to the mill and to a temporary column from which the original value of the variable could then be restored while the operation was in progress. And I took this to mean that I didn’t really have to worry about destructive reads in my analysis of Ada’s program, and I always used “retaining reads”.

      Of course it’s not quite that simple, and as you point out, in column 5 of the table, Ada does explain what happens to the operand variables (although not in the case of operation 9 and operation 21, but in these cases we can work it out from the other columns), and although I treated column 5 as a comment, it is necessary to explain if the variables are being read destructively or if they should retain their values.

      I have further read that when storing a computed value into a variable in the store that variable must have a zero value or the attempt to store the new value would jam the machine. This gives a very good reason for clearing out variables when you are finished with them (actually you would be restoring their values only when you needed them again, otherwise leaving them as zero), so they could be reused. (And I would have imagined that an implementation of the AE might want to detect an attempt to store to a non-zero column and HALT processing, if a jam is as nasty as it sounds. Which would be an early implementation of an Exception).

      To this end I have added a [[ LOADZ ]] opcode to my implementation of the AE (originally given in Part 2 of my article), which allows for a variable to be destructively read, and also a [[ warn ]] flag on the [[ AnalyticalEngine() ]] class to enable a warning if an attempt is made to store to a non-zero variable.

      I then implemented my Program 4 [[ ada5.py ]] to use the [[ LOADZ ]] opcode where the value is not required to be retained across the operation, and also to reset variables 6 and 7 at the end of the program (as per Ada’s table) – variable 13 being reset to zero by a destructive read in operation 24.

      In doing this I found a further couple of problems:

      Firstly, I found that for operation 21, column 5 of the table does not say what happens to variable 11, but the other columns of the table imply that variable 11 should zeroed. However, I found that if a “destructive read” for variable 11 was used here then the program did not function correctly, but if a “retaining read” was used then the program does produce the correct output, but also generates a warning that attempts are made to overwrite a non-zero value in variable 11 on subsequent executions of the program.

      I solved this by using a non-destructive read of variable 11 in operation 21 and then adding an additional instruction to reset variable 11 at the end of the program (as well as variables 6 and 7), then the program produces correct values, without any overwrite warnings.

      So I think there is still some issue (if not the one I originally raised) with which variables need to be reset to prepare the program for the next run, and in the case where reads are destructive and overwrites of non-zero values are not allowed Ada should have specified variables 6, 7, and 11 should be reset.

      Here is my revised version of [[ ada5.py ]] (uploaded as: [[ ada5z.py ]]):

      # alternative implementation of ada5.py using the LOADZ opcode where appropriate
      
      from analytical_engine import AnalyticalEngine, Column
      from fractions import Fraction
      from enigma import raw_input, printf
      
      # initialise the engine
      #ae = AnalyticalEngine(vars=14, number=Column(digits=10, dp=40), trace=0, warn=1)
      ae = AnalyticalEngine(vars=14, number=Fraction, trace=0, warn=1)
      
      # load the program
      ae.load_program([
        # initialisation
        ['SET', 0, 0],
        ['SET', 1, 1],
        ['SET', 2, 2],
        ['SET', 3, 1],
        # operation 1
        ['MUL'],
        ['LOAD', 2],
        ['LOAD', 3],
        ['STORE', 4],
        ['STORE', 5],
        ['STORE', 6],
        # operation 2
        ['SUB'],
        ['LOADZ', 4],
        ['LOAD', 1],
        ['STORE', 4],
        # operation 3
        ['ADD'],
        ['LOADZ', 5],
        ['LOAD', 1],
        ['STORE', 5],
        # operation 4
        ['DIV'],
        ['LOADZ', 4],
        ['LOADZ', 5],
        ['STORE', 11],
        # operation 5
        ['DIV'],
        ['LOADZ', 11],
        ['LOAD', 2],
        ['STORE', 11],
        # operation 6
        ['SUB'],
        ['LOADZ', 13],
        ['LOADZ', 11],
        ['STORE', 13],
        # operation 7
        ['SUB'],
        ['LOAD', 3],
        ['LOAD', 1],
        ['STORE', 10],
        # branch if zero to operation 24
        ['BRZ', +66],
        # operation 8
        ['ADD'],
        ['LOAD', 2],
        ['LOADZ', 7],
        ['STORE', 7],
        # operation 9
        ['DIV'],
        ['LOAD', 6],
        ['LOAD', 7],
        ['STORE', 11],
        # operation 10
        ['MUL'],
        ['LOAD_DATA'],
        ['LOAD', 11],
        ['STORE', 12],
        # operation 11
        ['ADD'],
        ['LOADZ', 12],
        ['LOADZ', 13],
        ['STORE', 13],
        # operation 12
        ['SUB'],
        ['LOADZ', 10],
        ['LOAD', 1],
        ['STORE', 10],
        # branch if zero to operation 24
        ['BRZ', +45],
        # operation 13
        ['SUB'],
        ['LOADZ', 6],
        ['LOAD', 1],
        ['STORE', 6],
        # operation 14
        ['ADD'],
        ['LOAD', 1],
        ['LOADZ', 7],
        ['STORE', 7],
        # operation 15
        ['DIV'],
        ['LOAD', 6],
        ['LOAD', 7],
        ['STORE', 8],
        # operation 16
        ['MUL'],
        ['LOADZ', 8],
        ['LOADZ', 11],
        ['STORE', 11],
        # operation 17
        ['SUB'],
        ['LOADZ', 6],
        ['LOAD', 1],
        ['STORE', 6],
        # operation 18
        ['ADD'],
        ['LOAD', 1],
        ['LOADZ', 7],
        ['STORE', 7],
        # operation 19
        ['DIV'],
        ['LOAD', 6],
        ['LOAD', 7],
        ['STORE', 9],
        # operation 20
        ['MUL'],
        ['LOADZ', 9],
        ['LOADZ', 11],
        ['STORE', 11],
        # operation 21
        ['MUL'],
        ['LOAD_DATA'],
        ['LOAD', 11],
        ['STORE', 12],
        # operation 22
        ['ADD'],
        ['LOADZ', 12],
        ['LOADZ', 13],
        ['STORE', 13],
        # operation 23
        ['SUB'],
        ['LOADZ', 10],
        ['LOAD', 1],
        ['STORE', 10],
        # branch if non-zero to operation 13
        ['BRN', -45],
        # operation 24
        ['SUB'],
        ['LOAD', 0],
        ['LOADZ', 13],
        # print
        ['PRINT'],
        # operation 25
        ['ADD'],
        ['LOAD', 1],
        ['LOADZ', 3],
        ['STORE', 3],
        # reset working variables
        ['SET', 6, 0],
        ['SET', 7, 0],
        ['SET', 11, 0], # stops overwrite warnings
        # end
        ['HALT'],
      ])
      
      # indices B[k]
      k = 1
      # input data, initially empty, but each result is added after computation
      data = []
      # instruction to start execution at, initially 0, but subsequently 4
      start = 0
      # run the program
      while True:
        # load the data and run the program
        ae.load_data(data)
        ae.run(start)
        # get the computed result from the output transcript
        r = (ae.output[-1] if ae.output else None)
      
        # display the computed result
        printf("B[{k}] = {r}")
      
        # run the program again?
        try:
          raw_input('\n[paused] >>> ')
        except EOFError:
          printf()
          break
      
        # add the result to the data and run it again (from instruction 4)
        data.append(r)
        start = 4
        k += 2
      

      I’ve also updated the assembler I wrote (for Part 3 of the article) to produce code that uses the [[ LOADZ ]] opcode if a variable is referenced with a trailing dot (i.e. Ada’s operation 4 would be: [[ DIV 4. 5. -> 11 ]]).

      I feel I might need to write a Part 4 on the subject.

Leave a Comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: