I must hasten to point out that this is *not* actually a metacompiler! Rather it's a self-recognizing grammar and translator that could be extended to become a (meta-)compiler.
You might be familiar with META-II:
→ Val Schorre's 1963 MetaII (Wikipedia)Implementations:
→ https://github.com/advancingdragon/metaIt turns out that there's an even smaller self-recognizing system!
I found this comment on Stackoverflow:
A clever fellow named Doug Michels, associated a long time ago with the 1980s (Unix supplier) Santa Cruz Operation, told me that he had gone considerably further and reduced the metacompiler self description to a very small number of characters. I never saw the work so I don't really know how far he got.
[EDIT] Dig, dig, dig... found this gem (on Linkedin):
Bill McKeeman, Adjunct Faculty at Dartmouth said:
> Doug was my undergraduate student; his senior thesis assignment was simple: write the shortest, extendable, self-compiling compiler. The front end took 27 characters; the whole thing took 63. It all fit on one IBM card. He published the result.
Dig, dig, dig some more: This seems to be Doug's 27 character paper. See Figure 2. By "front end", McKeeman apparantly means "just the parser"; the paper contains full translators that are a little larger.→ Comment by Ira Baxter in 2016
It's "A Concise Extensible Metalanguage for Translator Implementation", by Douglas L. Michels, 3rd USA-JAPAN Computer Conference, 1978.
The link to the paper gives me a 403 "Forbidden" response:
→ http://www.cs.dartmouth.edu/~mckeeman/cs48/mxcom/gem/html/michels1978.pdfBut I found a project called "Growing a compiler in matlab" on something called freesourcecode.net:
→ Growing a compiler in matlabThat provides a zipfile:
→ http://freesourcecode.net/sites/default/files/72297.zipAnd in that zip file there is a copy of the PDF. (TODO: upload it to archive and add a link here.)
Without further ado:
l ". l & l "& "l & . . & "" l "& l "l l "" ".
There are two binary prefix operators ``l`` and ``&`` for alternation and concatination respectively, a ``"`` unary prefix recognizer operator, and the ``.`` operator for recursion on the whole grammar.
go ::= l go go | & go go | " a | .
Miles specifies a simple interpreter for GO grammars in a pseudo-code. It's not very efficient, as Michels points out:
Fay [2] has demonstrated that a direct implementation of the interpreter executes painfully slow. He provides an example of a similar implementation that is easily implemented and has reasonable execution characteristics.
The reference [2] is to:
Fay, Michael, Bootstrapping a Small Translator Writing System, UCSC technical report 77-3-002, March 1977.
In any event, here's an implementation in Prolog (note that ``|`` is substituted for ``l``):
mo(Grammar, Input) :- mo(Grammar, Grammar, Input, []).
mo(['.'|_], Grammar) --> mo(Grammar, Grammar).
mo(['&'|Rest0], Grammar) --> mo(Rest0, Grammar), {skip(Rest0, Rest)}, mo(Rest, Grammar).
mo(['|'|Rest], Grammar) --> mo(Rest, Grammar).
mo(['|'|Rest0], Grammar) --> {skip(Rest0, Rest)}, mo(Rest, Grammar).
mo(['"', I|_Rest], _Grammar) --> [I].
skip --> ['.'].
skip --> ['&'], skip, skip.
skip --> ['|'], skip, skip.
skip --> ['"', _].
I feel like that shows very clearly the simple structure of the thing. You've got sequence (``&``), branch (``|``), and looping, (``.``); the only construct that actually consumes input is the quote operator (``"``); and skip is much cleaner thanks to DCG notation.
Something about the emit construct...
Rather than create a metacompiler in GO (which would be fun) let's instead use the META-II metacompiler and define a grammar fo GO in that.