Monday, December 7, 2015

A Recursively Enumerable Grammar for Binary Addition

As part of my 2nd semester CS course, I introduce students to the Chomsky hierarchy of grammars. We only go into details about regular grammars and the main learning objective is regular expressions, but I talk about context-free, context-sensitive, and recursively enumerable as well. Part of the reason is to show that there are limits to what you can do with grammars at each level, at least up to recursively enumerable, RE. I tell them that RE grammars are Turing complete, so they can compute anything that any computer could. I often use the example of a factorial because it is a fairly simple function that they are used to coding that doesn't seem at all like it would be doable with grammars.

This year I had a student ask about that on the questions they give me at the end of each class. He said that he couldn't see how a grammar could calculate factorial. I went to be that night trying to think of how to illustrate that a grammar could do math. I knew that I wasn't likely to write factorial, but I figured that if I started with addition might might be enough to show that you can build more complex math. The next question was how to represent that numbers. Addition in strings that use a unary format is simple, so I decided to go to binary. As an assignment in the 1st semester class I have students implement addition, multiplication, and exponentiation using only recursion, successor, and predecessor. I decided to take the approach here that I would do for addition there. We say that a+b = (a+1)+(b-1) and have a base case when b is 0. It turns out that you can implement successor and predecessor in binary strings fairly easily.

I put my code for doing this in a GitHub Gist that you can see at the link below. There are 14 productions in the grammar. In addition to the '0', '1', and '+' that I need to represent binary addition, I bracket the numbers with 'N' so that I can tell when I am at the beginning or end of a number. There are a few productions at the end that are used to "clean up" at the end so we get a single number in a string. There are only three productions that are used to create the successor functionality, which is done by putting in a 'S' character. The harder part is getting predecessor because the 'P' character needs to be inserted at the right end of the second number, and when it is done, we have to have that information sent back to the plus sign to say that the current operation is done. So while there are only two productions labeled as predecessor, the left and right message productions are also part of making the predecessor work.

Gist of the Code To see how this works, I ran it and entered the numbers 5 and 6. You can see the output of that here. The first line shows that we are adding 5 and 6. Each line below that is the application of a single production. The ones that don't have a ' after the plus sign are completed steps in the addition algorithm. The last line is printing the final result, which you can see is 11 in binary.
N101N+N110N
N101SN+'NL110N
N101SN+'N1L10N
N10S0N+'N1L10N
N10S0N+'N11L0N
N10S0N+'N110LN
N10S0N+'N110PN
N110N+'N110PN
N110N+'N11P1N
N110N+'N1R01N
N110N+'NR101N
N110N+N101N
N110SN+'NL101N
N110SN+'N1L01N
N111N+'N1L01N
N111N+'N10L1N
N111N+'N101LN
N111N+'N101PN
N111N+'N10R0N
N111N+'N1R00N
N111N+'NR100N
N111N+N100N
N111SN+'NL100N
N111SN+'N1L00N
N111SN+'N10L0N
N11S0N+'N10L0N
N11S0N+'N100LN
N1S00N+'N100LN
N1S00N+'N100PN
NS000N+'N100PN
NS000N+'N10P1N
N1000N+'N10P1N
N1000N+'N1P11N
N1000N+'NR011N
N1000N+N011N
N1000N+N11N
N1000SN+'NL11N
N1000SN+'N1L1N
N1001N+'N1L1N
N1001N+'N11LN
N1001N+'N11PN
N1001N+'N1R0N
N1001N+'NR10N
N1001N+N10N
N1001SN+'NL10N
N1001SN+'N1L0N
N1001SN+'N10LN
N1001SN+'N10PN
N100S0N+'N10PN
N100S0N+'N1P1N
N100S0N+'NR01N
N1010N+'NR01N
N1010N+N01N
N1010N+N1N
N1010SN+'NL1N
N1011N+'NL1N
N1011N+'N1LN
N1011N+'N1PN
N1011N+'NR0N
N1011N+N0N
N1011N+NN
N1011N
N1011N