Recently Nicholas Carlini won the (IOCCC) International Obfuscated C Code Contest for
...a feature-complete 4004 emulator that’s capable of emulating the original Busicom 141-pf calculator ROM ... this program is actually just a logic gate simulator, with the entire 4004 encoded as a circuit that’s then emulated.→ https://www.ioccc.org/2024/carlini/index.html
Wow! He wrote a blog post about it:
→ Gate-level emulation of an Intel 4004 in 4004 bytes of CAnd a separate blog post about the logic gate simulator:
→ miniHDL: A Python Hardware Description Language DSLReading it reminded me that I had done something similar (to the logic gate simulator, not to the whole calculator emulator) about a decade ago when I was learning about symbolic logic in the context of George Spencer-Brown's "Law of Form". I stopped just after developing a very simple "computer" that could generate and display a Sierpiński gasket. Here's that story...
In the beginning was the Void, And the Void was without Form.
In 1969 a strange book was published. Some folks dismiss it as a trivial restatement of binary Boolean logic, others find in it an almost mystical revelation.
→ "Laws of Form", George Spencer-Brown, 1969Binary Boolean logic requires two symbols, typically 0 and 1. The logic of George Spencer-Brown uses existence and non-existence instead. To make this concrete we shall implement the Laws of Form logic in Python using strings of balanced pairs of parentheses, a Dyck language:
In the theory of formal languages ... a Dyck word is a balanced string of brackets. The set of Dyck words forms a Dyck language. The simplest, Dyck-1, uses just two matching brackets, e.g. ( and ).→ Dyck language
Our logic machine uses the existence of a empty pair of parentheses as one value and nothing (in the form of the so-called "empty" string) as the other value. We will label one of these "true" and the other "false":
T = '()'
F = ''
(This choice is somewhat arbitrary in the sense that the logic works the same either way, just that if you assign '()' to "false" and '' to "true" you must exchange the names of the "or_" and "and_" functions and the "nor" and "nand" functions (below) to reflect the inversion of meaning, "xor" and "equiv" remain the same of course. The only advantage of doing it this way is that the '()' and '' in Python evaluate in Boolean contexts to True and False respectively.)
Now that we have done that we can write some functions (taking advantage of some nifty Python features) to implement logical operations. Without further ado:
def or_(*bits): return ''.join(bits)
def nor(*bits): return f'({or_(*bits)})'
def and_(*bits): return f'({nand(*bits)})'
def nand(*bits): return or_(*(f'({bit})' for bit in bits))
def xor(a, b): return f'({a}({b}))(({a}){b})'
def equiv(a, b): return f'({xor(a, b)})'
We will use these in a moment to construct logical expressions but first let's examine how they are reduced to simplest form. There are two rules:
()() = ()
(()) =
These rules can be expressed in a poetic way:
The Mark repeated is the Mark.
The Unmarked Mark is the Void.
Anyway, they are expressed in the function:
reduction_step = lambda form: (
form
.replace('()()', '()')
.replace('(())', '')
)
Which must be run repeatedly until no more replacements can be done. We'll use a generic higher-order function that does that:
def to_fixed_point(function, term0):
while True:
term1 = function(term0)
if term0 == term1:
break
term0 = term1
return term1
With this and the reduction_step function we create a function "reduce_" (the trailing underscore prevents it from shadowing the built-in "reduce" function) which when given a well-formed logical expression will reduce it to either T or F:
from functools import partial as curry
reduce_ = curry(to_fixed_point, reduction_step)
Perhaps amazingly, this is a complete logical calculator. E.g.:
assert reduce_(or_(T, T)) == T
assert reduce_(or_(F, T)) == T
assert reduce_(or_(T, F)) == T
assert reduce_(or_(F, F)) == F
assert reduce_(or_(F, F, F)) == F
assert reduce_(or_(F, T, F)) == T
assert reduce_(and_(T, T)) == T
assert reduce_(and_(F, T)) == F
assert reduce_(and_(T, F)) == F
assert reduce_(and_(F, F)) == F
assert reduce_(and_(T, T, T)) == T
assert reduce_(and_(T, F, T)) == F
# Etc.
All of these assertions succeed.
We will use single-character symbols to represent variables in our expressions. Let's make a handful:
a, b, c, d, e, f, g = 'abcdefg'
Using these we can create expressions such as this fragment of an adder circuit:
E = xor(xor(a, b), c)
That results in this formula:
E = ((a(b))((a)b)(c))(((a(b))((a)b))c)
Given such a formula (aka "form", aka "expression") we can substitute values for the symbols using a dictionary and a Regular Expression:
import re
env = {a:F, b:F, c:F}
Q = re.sub('|'.join(env), lambda match: env[match.group()], E)
This gives us a Dyck language string with no symbols in it. We can apply the reduction function to get either '()' or ''.
Q = ((())(())())(((())(())))
Then:
print(reduce_(Q) == T)
Will print "False". In this case that proves that 0 + 0 + 0 = 0. We're on solid ground.
We can try all the possible Boolean values for each variable in an expression and reduce to find the solutions of the expression.
We start with creating the universe of possible values for a variable (just the pair (F, T).)
B = F, T
Then we try each possible assignment of values to the variables.
from itertools import product
def solve_brute_force(form):
symbols = set(form) - {'(', ')'}
regex = '|'.join(symbols)
for values in product(*(B for _ in symbols)):
env = dict(zip(symbols, values))
expr = re.sub(regex, lambda match: env[match.group()], form)
if reduce_(expr):
yield env
for solution in solve_brute_force(E):
print(solution)
This prints out the dictionaries of value assignments for the symbols which make the expression reduce to '()':
{'c': '', 'a': '', 'b': '()'}
{'c': '', 'a': '()', 'b': ''}
{'c': '()', 'a': '', 'b': ''}
{'c': '()', 'a': '()', 'b': '()'}
We can convert the set of environment dictionaries to a disjunctive normal form (DNF) of the initial expression:
→ Disjunctive Normal FormWe convert the dictionaries to AND clauses and OR those together:
def env_to_form(env):
return nor(*(
nor(symbol) if env[symbol] else symbol
for symbol in sorted(env)
))
def DNF(form):
return nor(*sorted(map(env_to_form, solve_brute_force(form))))
E = ((a(b))((a)b)(c))(((a(b))((a)b))c)
DNF(E) = ((a)(b)(c))((a)bc)(a(b)c)(ab(c))
So that's neat.
It's possible to use this machinery to build models of digital electronic circuits up to and including whole calculators, as indeed Nicholas Carlini has done. I never got as far as that, but for the sake of curiousity here's an adder circuit fragment:
def full_bit_adder(a, b, c):
return (
xor(xor(a, b), c),
or_(and_(a, b), and_(c, xor(a, b))),
)
I got the formulas from Wikipedia, but you can see it's identical to the the version Carlini uses, except that that eliminates duplicate work by creating a graph with intermediate sub-expressions rather than a tree (paraphrased into Python for ease of comparision):
def full_adder(a, b, c0):
d = xor(a, b)
e = xor(d, c0)
c = or_(
and_(a, b),
and_(d, c0),
)
return e, c
My version repeats "xor(a, b)" while his assigns that to an intermediate variable "d".
Anyway, if you want to see how to build a normal computer out of this you should read Carlini's blog posts.
→ miniHDL: A Python Hardware Description Language DSLI went a different way. I made a sort of ultra-parallel computer in which every bit of state (represented as a single unified register file and RAM) is updated all at once on each cycle. It's more of a schematic than something one would actually build, but it's enough to capture the fundamental idea of a binary Boolean digital computer.
Start with what I think of as the "universe" of the state of the computer, the set of all names:
U = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
The state of the computer is kept in a single register (that is also the RAM) that holds one bit per name, all initialized to F:
R = {name: F for name in U}
A program in this system is a mapping from names in the universe to expression in the Laws of Form notation. E.g. this is a simple cellular automata program that just XORs each bit with its neighbor, wrapping at the edge.
P = {
name: xor(name, U[(i + 1) % 52])
for i, name in enumerate(U)
}
Lets have a simple linear display for our computer. We'll print a '.' or an '@' to indicate F or T:
def show(reg):
print(''.join('@.'[not reg[name]] for name in U))
With no bits in the register nothing would happen, so let's set one bit to T before we begin.
R['z'] = T
Now to run the machine we evaluate each named expression in the program in the context or environment of the current values in the register to get the new values for each register bit. In pseudo-code:
Rₜ₊₁ = reduce(subst(P, Rₜ))
In code:
subst = curry(
re.sub,
'[A-Za-z]', # Use knowledge of U to write more efficient(?) RE.
lambda match: R[match.group()],
)
# Run through some cycles.
for _ in range(100):
show(R)
R.update((name, reduce_(subst(P[name]))) for name in P)
And here's our lovely Sierpiński Gasket:
...................................................@ ..................................................@@ .................................................@.@ ................................................@@@@ ...............................................@...@ ..............................................@@..@@ .............................................@.@.@.@ ............................................@@@@@@@@ ...........................................@.......@ ..........................................@@......@@ .........................................@.@.....@.@ ........................................@@@@....@@@@ .......................................@...@...@...@ ......................................@@..@@..@@..@@ .....................................@.@.@.@.@.@.@.@ ....................................@@@@@@@@@@@@@@@@ ...................................@...............@ ..................................@@..............@@ .................................@.@.............@.@ ................................@@@@............@@@@ ...............................@...@...........@...@ ..............................@@..@@..........@@..@@ .............................@.@.@.@.........@.@.@.@ ............................@@@@@@@@........@@@@@@@@ ...........................@.......@.......@.......@ ..........................@@......@@......@@......@@ .........................@.@.....@.@.....@.@.....@.@ ........................@@@@....@@@@....@@@@....@@@@ .......................@...@...@...@...@...@...@...@ ......................@@..@@..@@..@@..@@..@@..@@..@@ .....................@.@.@.@.@.@.@.@.@.@.@.@.@.@.@.@ ....................@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ ...................@...............................@ ..................@@..............................@@ .................@.@.............................@.@ ................@@@@............................@@@@ ...............@...@...........................@...@ ..............@@..@@..........................@@..@@ .............@.@.@.@.........................@.@.@.@ ............@@@@@@@@........................@@@@@@@@ ...........@.......@.......................@.......@ ..........@@......@@......................@@......@@ .........@.@.....@.@.....................@.@.....@.@ ........@@@@....@@@@....................@@@@....@@@@ .......@...@...@...@...................@...@...@...@ ......@@..@@..@@..@@..................@@..@@..@@..@@ .....@.@.@.@.@.@.@.@.................@.@.@.@.@.@.@.@ ....@@@@@@@@@@@@@@@@................@@@@@@@@@@@@@@@@ ...@...............@...............@...............@ ..@@..............@@..............@@..............@@ .@.@.............@.@.............@.@.............@.@ @@@@............@@@@............@@@@............@@@@ ...@...........@...@...........@...@...........@.... ..@@..........@@..@@..........@@..@@..........@@.... .@.@.........@.@.@.@.........@.@.@.@.........@.@.... @@@@........@@@@@@@@........@@@@@@@@........@@@@.... ...@.......@.......@.......@.......@.......@...@...@ ..@@......@@......@@......@@......@@......@@..@@..@@ .@.@.....@.@.....@.@.....@.@.....@.@.....@.@.@.@.@.@ @@@@....@@@@....@@@@....@@@@....@@@@....@@@@@@@@@@@@ ...@...@...@...@...@...@...@...@...@...@............ ..@@..@@..@@..@@..@@..@@..@@..@@..@@..@@............ .@.@.@.@.@.@.@.@.@.@.@.@.@.@.@.@.@.@.@.@............ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@............ .......................................@...........@ ......................................@@..........@@ .....................................@.@.........@.@ ....................................@@@@........@@@@ ...................................@...@.......@...@ ..................................@@..@@......@@..@@ .................................@.@.@.@.....@.@.@.@ ................................@@@@@@@@....@@@@@@@@ ...............................@.......@...@.......@ ..............................@@......@@..@@......@@ .............................@.@.....@.@.@.@.....@.@ ............................@@@@....@@@@@@@@....@@@@ ...........................@...@...@.......@...@...@ ..........................@@..@@..@@......@@..@@..@@ .........................@.@.@.@.@.@.....@.@.@.@.@.@ ........................@@@@@@@@@@@@....@@@@@@@@@@@@ .......................@...........@...@...........@ ......................@@..........@@..@@..........@@ .....................@.@.........@.@.@.@.........@.@ ....................@@@@........@@@@@@@@........@@@@ ...................@...@.......@.......@.......@...@ ..................@@..@@......@@......@@......@@..@@ .................@.@.@.@.....@.@.....@.@.....@.@.@.@ ................@@@@@@@@....@@@@....@@@@....@@@@@@@@ ...............@.......@...@...@...@...@...@.......@ ..............@@......@@..@@..@@..@@..@@..@@......@@ .............@.@.....@.@.@.@.@.@.@.@.@.@.@.@.....@.@ ............@@@@....@@@@@@@@@@@@@@@@@@@@@@@@....@@@@ ...........@...@...@.......................@...@...@ ..........@@..@@..@@......................@@..@@..@@ .........@.@.@.@.@.@.....................@.@.@.@.@.@ ........@@@@@@@@@@@@....................@@@@@@@@@@@@ .......@...........@...................@...........@ ......@@..........@@..................@@..........@@ .....@.@.........@.@.................@.@.........@.@ ....@@@@........@@@@................@@@@........@@@@
The Laws of Form notation has many more tricks up its sleeve, cf. Markable Mark.
FIN.