The tiniest transpiler you'll ever see
I wrote a brainf**k to C transpiler this morning. It took me a grand total of roughly an hour.
The entire thing is under 50 lines of C. You can see it here.
Brainf**k is an esoteric programming language. Invented by a Swiss student in 1993, it's pretty much the minimum required to be considered Turing complete.
It's also one of the most famous esolangs in existence.
The syntax is extremely minimal: it has only 8 characters and the rest are ignored.
Take a guess what that does. Just guess.
It's a Hello, World! program.
Essentially, in brainf**k, you're given a 30,000-byte array and a cursor. You can move the cursor with > and <. You can modify the memory with + and -, which increment or decrement the value in the current cell. You can create loops with [ and ]. Finally, the , character reads a single byte of input into the current cell, and the . character prints the value of the current cell.
That's pretty much everything about brainf**k.
I've written more functional programs in Assembly before.
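To make the eight commands concrete, here is a minimal sketch that reads them directly as a tiny interpreter loop. It is not the transpiler from this post, just an illustration of the tape-and-cursor model. The function name bf_run, the wrapping cursor, and storing 0 at end of input are all assumptions made for this example (implementations differ on those last two), and it assumes the brackets in the program are balanced.

```c
#include <stddef.h>
#include <string.h>

#define TAPE_SIZE 30000

/*
 * Runs brainf**k `code`, taking input bytes from the string `input` and
 * storing output in `out` (NUL-terminated, at most out_cap - 1 bytes).
 * Returns the number of output bytes stored. Assumes balanced brackets.
 */
size_t bf_run(const char *code, const char *input,
              unsigned char *out, size_t out_cap)
{
    unsigned char tape[TAPE_SIZE] = {0}; /* the 30,000-byte array */
    size_t p = 0;                        /* the cursor */
    size_t n = 0;                        /* output bytes stored so far */
    size_t len = strlen(code);

    for (size_t i = 0; i < len; i++) {
        switch (code[i]) {
        case '>': p = (p + 1) % TAPE_SIZE; break;             /* assumption: cursor wraps */
        case '<': p = (p + TAPE_SIZE - 1) % TAPE_SIZE; break; /* assumption: cursor wraps */
        case '+': tape[p]++; break;                           /* 255 wraps to 0 */
        case '-': tape[p]--; break;                           /* 0 wraps to 255 */
        case '.':
            if (n + 1 < out_cap) out[n++] = tape[p];
            break;
        case ',':
            /* assumption: end of input stores 0 in the cell */
            tape[p] = (*input != '\0') ? (unsigned char)*input++ : 0;
            break;
        case '[':
            if (tape[p] == 0) {          /* skip forward to the matching ] */
                int depth = 1;
                while (depth > 0 && ++i < len) {
                    if (code[i] == '[') depth++;
                    else if (code[i] == ']') depth--;
                }
            }
            break;
        case ']':
            if (tape[p] != 0) {          /* jump back to the matching [ */
                int depth = 1;
                while (depth > 0 && i > 0) {
                    i--;
                    if (code[i] == ']') depth++;
                    else if (code[i] == '[') depth--;
                }
            }
            break;
        default:
            break;                       /* every other character is ignored */
        }
    }
    if (out_cap > 0) out[n] = '\0';
    return n;
}
```

For example, the program ++++++++[>++++++++<-]>+. sets the first cell to 8, adds 8 to the second cell eight times, bumps it once more to 65, and prints it as the letter A.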
I wrote this compiler entirely because I was bored and I've found a ton of interpreters, so I thought the world needed a brainf**k compiler.
Although, admittedly, if you want a really good brainf**k compiler, check out this one.
Many reasons:
And one final reason: integer overflow. Normally, this is a bad thing that people hate. It's probably the reason unit tests (ugh) were invented. But brainf**k is different. The numbers in the memory tape have an upper limit of 255, and if they pass it, they're expected to reset to 0. Also, if the value goes below 0, it's supposed to reset to 255. An 8-bit C char wraps around like this on its own (strictly speaking, the standard only guarantees it for unsigned char, but plain char behaves the same way on mainstream compilers), so I didn't need to write any code for it.
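As a quick check of that wraparound rule, here is a minimal sketch using unsigned char, the type for which C guarantees modulo-256 arithmetic on 8-bit bytes. The helper names cell_inc and cell_dec are made up for illustration; they are not part of the transpiler.

```c
/* Hypothetical helpers showing what + and - do to a single tape cell. */

/* 255 + 1 is computed as an int (256), then converted back to
 * unsigned char, which reduces it modulo 256 to 0. */
unsigned char cell_inc(unsigned char cell)
{
    return (unsigned char)(cell + 1);
}

/* 0 - 1 is computed as an int (-1), then converted back to
 * unsigned char, which reduces it modulo 256 to 255. */
unsigned char cell_dec(unsigned char cell)
{
    return (unsigned char)(cell - 1);
}
```

So cell_inc(255) gives 0 and cell_dec(0) gives 255, with no explicit bounds checks anywhere.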
A higher-level overview:
It reads brainf**k code from a file into code[].
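A minimal sketch of that first step, assuming the whole source file fits in a fixed-size buffer. The function name read_source and its return convention are assumptions for this example, not necessarily how the original code does it.

```c
#include <stdio.h>

/*
 * Reads at most cap - 1 bytes from the file at `path` into `code` and
 * NUL-terminates the result. Returns the number of bytes read, or -1 if
 * the file cannot be opened or the buffer has no room at all.
 */
long read_source(const char *path, char *code, size_t cap)
{
    if (cap == 0)
        return -1;

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    size_t n = fread(code, 1, cap - 1, f); /* leave room for the terminator */
    fclose(f);

    code[n] = '\0';
    return (long)n;
}
```

Anything past cap - 1 bytes is silently dropped here, which is fine for a toy but worth a real error in anything bigger.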
Then, it sets up a basic C program:
You may have noticed that it's missing a closing bracket. That's because more code is added to that char[].
In case you're wondering, char t[30000] is the memory that you're given. The name t is short for tape; I shortened it because these programs aren't meant to be human readable.
Next, it loops over the code array, which is an array of single characters. For each character, it converts it to C code:
character | becomes |
---|---|
> | p++ |
< | p-- |
- | t[p]-- |
+ | t[p]++ |
. | putchar(t[p]) |
, | t[p]=getchar() |
[ | while(t[p] != 0) { |
] | } |
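Putting the pieces together, the translation step can be sketched roughly like this. It is a minimal sketch under stated assumptions, not the original source: the function names bf_to_c, append, and transpile are hypothetical, the exact prologue and epilogue strings are guesses based on the description (a char t[30000] tape and a cursor p, here an int), and each fragment gets a trailing semicolon so the output is valid C.

```c
#include <stddef.h>
#include <string.h>

/* Maps one brainf**k character to a C fragment; NULL means "ignore it". */
const char *bf_to_c(char c)
{
    switch (c) {
    case '>': return "p++;";
    case '<': return "p--;";
    case '+': return "t[p]++;";
    case '-': return "t[p]--;";
    case '.': return "putchar(t[p]);";
    case ',': return "t[p]=getchar();";
    case '[': return "while(t[p] != 0) {";
    case ']': return "}";
    default:  return NULL;
    }
}

/* Appends s to out if it fits (including the terminator); -1 otherwise. */
static int append(char *out, size_t cap, size_t *used, const char *s)
{
    size_t n = strlen(s);
    if (*used + n + 1 > cap)
        return -1;
    memcpy(out + *used, s, n + 1);
    *used += n;
    return 0;
}

/*
 * Writes a complete C program equivalent to `code` into `out`.
 * Returns 0 on success, or -1 if `out` is too small.
 */
int transpile(const char *code, char *out, size_t cap)
{
    /* Assumed prologue: note the deliberately unclosed brace of main. */
    static const char prologue[] =
        "#include <stdio.h>\nint main(){char t[30000]={0};int p=0;";
    /* Assumed epilogue: closes main once all fragments are appended. */
    static const char epilogue[] = "return 0;}\n";
    size_t used = 0;

    if (cap == 0)
        return -1;
    out[0] = '\0';

    if (append(out, cap, &used, prologue) != 0)
        return -1;
    for (const char *c = code; *c != '\0'; c++) {
        const char *frag = bf_to_c(*c);
        if (frag != NULL && append(out, cap, &used, frag) != 0)
            return -1;
    }
    return append(out, cap, &used, epilogue);
}
```

Feeding it +[-]. produces a main whose body is t[p]++;while(t[p] != 0) {t[p]--;}putchar(t[p]);, which any C compiler can then turn into a native binary. Unbalanced brackets are not checked here; they simply show up as a compile error in the generated C.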