Programming Thread

Started by the-pi-guy, Mar 13, 2016, 10:39 PM

the-pi-guy

A compiler works in phases.  

There's the scanning (lexing) phase, the parsing phase, and then the code generation phase.  Technically you can split it up more.  The code generation phase is often really an intermediate code phase, which gets followed by an optimization phase, which gets followed by a final code generation phase.  It can get even more complicated than that.  

Lexing is done with a simple DFA (deterministic finite automaton).  
It's basically set up with 1000 if statements to match the input character by character.  
Read an i, that means it can be either:
-int
-identifier
-if
etc.  
Read an f next, that means it can now only be:
-if
-identifier

Read a character that can't continue a word (a left paren, a space, an = sign), take the input as an if.  But if you read a letter like t, take the input as an identifier.  

Okay, it's probably usually closer to 100 if statements.  (Although it's not hard to imagine it blowing up even more.)
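
To make the i / f / t walk-through above concrete, here's a rough sketch of that kind of DFA in Python.  The state names and the classify_word function are made up for illustration, and it only knows about if, int and identifiers.  A real lexer would have a lot more states.

```python
# Hypothetical state names; the transitions follow the i / f / t walk-through.
def is_word_char(ch):
    return ch.isalnum() or ch == "_"

def classify_word(word):
    """Run a tiny hand-written DFA over `word` and name the token it matches."""
    state = "start"
    for ch in word:
        if state == "start":
            if ch == "i":
                state = "seen_i"        # could still be if, int, or an identifier
            elif ch.isalpha() or ch == "_":
                state = "ident"
            else:
                return "error"
        elif state == "seen_i":
            if ch == "f":
                state = "seen_if"       # if, unless more word characters follow
            elif ch == "n":
                state = "seen_in"
            elif is_word_char(ch):
                state = "ident"
            else:
                return "error"
        elif state == "seen_in":
            if ch == "t":
                state = "seen_int"
            elif is_word_char(ch):
                state = "ident"
            else:
                return "error"
        else:
            # seen_if, seen_int, ident: one more word character means identifier
            if is_word_char(ch):
                state = "ident"
            else:
                return "error"
    # Accepting states and the token each one stands for.
    accepting = {
        "seen_i": "id",
        "seen_in": "id",
        "seen_if": "if",
        "seen_int": "int",
        "ident": "id",
    }
    return accepting.get(state, "error")
```

So classify_word("if") gives "if", but classify_word("ift") falls through to "id", which is the same decision described above.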

That's the scanning phase.  The output will be something like
if leftparen id equals id rightparen
etc., where each word names some meaningful piece of the input (a token).
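
Here's a small sketch of a scanner that produces that kind of output.  Instead of hand-written if statements it leans on Python's re module to do the matching, so it's a shortcut and not how you'd write the DFA yourself.  The token names follow the example above.  Treating == as "equals" and the scan function itself are my assumptions.

```python
import re

# Token names follow the example output; the patterns are assumptions.
TOKEN_SPEC = [
    ("leftparen", r"\("),
    ("rightparen", r"\)"),
    ("equals", r"=="),
    ("word", r"[A-Za-z_]\w*"),
    ("skip", r"\s+"),
]
KEYWORDS = {"if", "int"}
PATTERN = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))

def scan(source):
    """Turn source text into a list of token names."""
    names = []
    pos = 0
    while pos < len(source):
        match = PATTERN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        kind = match.lastgroup
        if kind == "word":
            # A word is a keyword if it is in the keyword set, otherwise an id.
            names.append(match.group() if match.group() in KEYWORDS else "id")
        elif kind != "skip":
            names.append(kind)
        pos = match.end()
    return names
```

Calling scan("if (a == b)") returns the list if, leftparen, id, equals, id, rightparen.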

Then that gets passed to the parser.  

The parser can work a few different ways.  
Some approaches work really nicely with recursion.  Others don't work as nicely.

The way a language is set up is with a grammar.
Something like this:

Statement -> assign    
Statement -> print
assign -> id = value
print -> print(id)

Then the way the parser works is that it'll basically check whether the first token is a print or an id.  From that it knows which rule it's in and what should come next.  
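
A minimal sketch of that idea for the four-rule grammar above.  It assumes the scanner hands over (kind, text) pairs, and the kind names ("id", "equals", "value", and so on) are just ones I picked for the example.

```python
def parse_statement(tokens):
    """Parse one statement from a list of (kind, text) tokens into a small tree."""
    pos = 0

    def expect(kind):
        # Consume the next token if it has the right kind, otherwise complain.
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != kind:
            found = tokens[pos][0] if pos < len(tokens) else "end of input"
            raise SyntaxError(f"expected {kind}, found {found}")
        text = tokens[pos][1]
        pos += 1
        return text

    if not tokens:
        raise SyntaxError("empty statement")

    first = tokens[0][0]
    if first == "print":
        # Statement -> print, print -> print(id)
        expect("print")
        expect("leftparen")
        name = expect("id")
        expect("rightparen")
        tree = ("print", name)
    elif first == "id":
        # Statement -> assign, assign -> id = value
        name = expect("id")
        expect("equals")
        value = expect("value")
        tree = ("assign", name, value)
    else:
        raise SyntaxError(f"a statement cannot start with {first}")

    if pos != len(tokens):
        raise SyntaxError("unexpected tokens after statement")
    return tree
```

The only real decision is the look at the first token.  After that each rule just demands its tokens in order.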


The parser hands a tree to the code generation phase.  The bottom of the tree holds whatever needs to be done first.  

The tree is especially useful for statements like this:

x = x+5*y-3

It'll basically change it to something like:
sub(add(x,mul(5,y)),3)

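Although it's not exactly obvious how that's a tree.  Each call is a node and its arguments are its children, so sub sits at the top and mul(5,y) sits at the bottom.  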
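
Here's a sketch of how a recursive parser could build that tree for the right-hand side, with * binding tighter than + and -.  The nested tuples are the tree, and render prints them with indentation so the shape shows up.  The function names and the tuple layout are my own choices for the example.

```python
import re

def parse_expression(text):
    """Build a nested-tuple tree such as ("add", "x", "5") from an expression."""
    # Note: characters the pattern does not know about are silently skipped.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        tok = peek()
        if tok is None or tok in ("+", "-", "*", "/", ")"):
            raise SyntaxError(f"expected a name or number, found {tok}")
        if tok == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected )")
            advance()
            return node
        return advance()            # a name or a number is a leaf

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = ("mul" if op == "*" else "div", node, factor())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = ("add" if op == "+" else "sub", node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()}")
    return tree

def render(node, depth=0):
    """Return the tree as indented lines, one node per line."""
    indent = "  " * depth
    if isinstance(node, tuple):
        op, left, right = node
        return [indent + op] + render(left, depth + 1) + render(right, depth + 1)
    return [indent + node]
```

parse_expression("x+5*y-3") returns ("sub", ("add", "x", ("mul", "5", "y")), "3"), and printing "\n".join(render(...)) on that puts sub at the top with mul and its two leaves indented the deepest.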

Then the code generation uses that to make something like this (MIPS assembly here):
li $t0, 5
lw $t1, y
mul $t1,$t1,$t0
etc.
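
A rough sketch of how the code generation could walk the tree to get output like that.  It assumes the tree is nested tuples like ("mul", "5", "y"), hands out $t0 through $t9 in the most naive way possible, and only knows add, sub and mul.  The instruction order it picks won't match the snippet above exactly, and a real compiler would be smarter about registers.

```python
def generate(tree, target=None):
    """Emit MIPS-style lines for an expression tree, children before parents."""
    lines = []
    next_reg = 0    # index of the next free temporary register

    def walk(node):
        nonlocal next_reg
        if isinstance(node, tuple):
            op, left, right = node
            if op not in ("add", "sub", "mul"):
                raise ValueError(f"unsupported operation {op}")
            a = walk(left)
            b = walk(right)
            lines.append(f"{op} {a}, {a}, {b}")   # result goes back into a
            next_reg -= 1                         # b's register is free again
            return a
        if next_reg > 9:
            raise RuntimeError("out of temporary registers")
        reg = f"$t{next_reg}"
        next_reg += 1
        if node.isdigit():
            lines.append(f"li {reg}, {node}")     # load a constant
        else:
            lines.append(f"lw {reg}, {node}")     # load a variable from memory
        return reg

    result = walk(tree)
    if target is not None:
        lines.append(f"sw {result}, {target}")    # store for the assignment
    return lines

tree = ("sub", ("add", "x", ("mul", "5", "y")), "3")
for line in generate(tree, target="x"):
    print(line)
```

The leaves get loaded first and each operation is only emitted once both of its children are sitting in registers, which is why the bottom of the tree ends up being done first.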


Even a very simple 3-phase compiler can quickly become a massive project.  

For the most part, they are not difficult to make, but they are very time consuming.  

It does get a little tricky in places, since you need to set up some things to manage registers and memory locations.