I made a toy compiler
Nov. 26th, 2009 01:46 amThis was difficult to write so it would include some background for non-programmers, but still be of some interest to real programmers.
There is a turing-machine equivalent based entirely on a simple sequnece of operations on which of a series of fractions divide each other described in http://en.wikipedia.org/wiki/FRACTRAN
On stack overflow, a challenge http://stackoverflow.com/questions/1749905/code-golf-fractran was recently posed for the shortest program which could run those "programs" and a special prize for anyone who wrote an interpreter in those fractions.
Writing a compiler in the same language is a sign that a programming language has "arrived". The first compiler ever had to be written in assembly language because that's why someone was writing a C compiler in the first place, so they didn't always have to code individual machine instructions by hand. But once someone had, then just like any other program, a C compiler is mostly written in C. Writing a fractran interpreter in fractran isn't especiallly useful, but is interesting.
I wanted to do so. I'd never used lex or yacc derivatives before, and felt it was something everyone else knew well and I'd really like to do so. You specify a bit of C and a grammar: a list of tokens like "{" and "+" and any string, and a list of valid compound statements, like "EITHER anything inside { } OR a single statement" and so on. And they automagically stitch it all together into a C program which reads those symbols from the command line, and does whatever you told it to.
It was surprisingly easy. By far the hardest part was getting a mix of automagically generated code and C++ to compile. Otherwise the program mostly just worked.
I started with a compiler for a register machine related to the fractions, and invented a syntax for a simple example thereof. It has an infinite array of registers (though two is sufficient) and three to four instructions:
And made a program which can turn:
into:
And
into:
I was about to add a back end which can turn the instructions into Fractran instructions (it's easy to represent "line number in program", "increase working value", "decrease working value" and "decrease else" with the fractions once you're used to it), and to add enough useful concepts build into the compiler but built out of those instructions, like: an array syntax which automatically compiles to using powers of 2 for first element, multiplied by powers of 3 for second element, etc, etc; program control structures like "if" "while"; variable names, etc.
Unfortunately, someone else had done something similar, and finished it first. So I decided to call the interpreter I'd written a success and come back to it when I had need of something similar, and not spend any more time trying to make it deal with fractions.
There is a turing-machine equivalent based entirely on a simple sequnece of operations on which of a series of fractions divide each other described in http://en.wikipedia.org/wiki/FRACTRAN
On stack overflow, a challenge http://stackoverflow.com/questions/1749905/code-golf-fractran was recently posed for the shortest program which could run those "programs" and a special prize for anyone who wrote an interpreter in those fractions.
Writing a compiler in the same language is a sign that a programming language has "arrived". The first compiler ever had to be written in assembly language because that's why someone was writing a C compiler in the first place, so they didn't always have to code individual machine instructions by hand. But once someone had, then just like any other program, a C compiler is mostly written in C. Writing a fractran interpreter in fractran isn't especiallly useful, but is interesting.
I wanted to do so. I'd never used lex or yacc derivatives before, and felt it was something everyone else knew well and I'd really like to do so. You specify a bit of C and a grammar: a list of tokens like "{" and "+" and any string, and a list of valid compound statements, like "EITHER anything inside { } OR a single statement" and so on. And they automagically stitch it all together into a C program which reads those symbols from the command line, and does whatever you told it to.
It was surprisingly easy. By far the hardest part was getting a mix of automagically generated code and C++ to compile. Otherwise the program mostly just worked.
I started with a compiler for a register machine related to the fractions, and invented a syntax for a simple example thereof. It has an infinite array of registers (though two is sufficient) and three to four instructions:
Inc r4 Dec r1 Goto LabelName Dec r2 else goto labelnameifzero
And made a program which can turn:
Inc r1; inc r1; inc r1; inc r1; Inc r2; Inc r2; Inc r2;
into:
4, 3
And
Four: Inc r1; inc r1; inc r1; inc r1; Three: Inc r2; Inc r2; Inc r2; Multiply: Loop1: dec r1 else goto Tmp1; inc r5; Add: Loop2: dec r2 else goto Tmp2; Inc r3; Inc r4; Goto Loop2; Tmp2: dec r4 else goto Loop1; inc r2; goto Tmp2; Tmp1: dec r5 else goto End; inc r1; goto Tmp1; End: HALT;
into:
4,3,12
I was about to add a back end which can turn the instructions into Fractran instructions (it's easy to represent "line number in program", "increase working value", "decrease working value" and "decrease else" with the fractions once you're used to it), and to add enough useful concepts build into the compiler but built out of those instructions, like: an array syntax which automatically compiles to using powers of 2 for first element, multiplied by powers of 3 for second element, etc, etc; program control structures like "if" "while"; variable names, etc.
Unfortunately, someone else had done something similar, and finished it first. So I decided to call the interpreter I'd written a success and come back to it when I had need of something similar, and not spend any more time trying to make it deal with fractions.