lecture #1 began here
Why study compilers?
Most CS students will never write a commercial compiler, but that is not why we
study compilers. We study compiler construction for the
following reasons:
The questions we should ask, then, are: (a) should CS majors be required to spend a lot of time becoming really good programmers? and (b) are we providing students with the assistance and access to the tools and information they need to accomplish their goals with the minimal doses of inevitable pain that are required?
There are several major kinds of compilers:
Note: although last year's CS 370 lecture notes are ALL available to you up front, I generally revise each lecture's notes, making additions, corrections and adaptations to this year's homeworks, the night before each lecture. The best time to print hard copies of the lecture notes is one day at a time, right before the lecture is given.
IDENTIFIER=[a-zA-Z][a-zA-Z0-9]*
stdio
#include <stdio.h>

This defines a data type (FILE *) and gives prototypes for relevant functions. The following code opens a file using a string filename and reads the first character (into an int variable, not a char, so that it can detect end-of-file; EOF is not a legal char value).

FILE *f = fopen(filename, "r");
int i = fgetc(f);
if (i == EOF) /* empty file... */
#include <stdio.h>

/* cat: concatenate files, version 1 */
int main(int argc, char *argv[])
{
   FILE *fp;
   void filecopy(FILE *, FILE *);

   if (argc == 1)
      filecopy(stdin, stdout);
   else
      while (--argc > 0)
         if ((fp = fopen(*++argv, "r")) == NULL) {
            printf("cat: can't open %s\n", *argv);
            return 1;
         } else {
            filecopy(fp, stdout);
            fclose(fp);
         }
   return 0;
}

void filecopy(FILE *ifp, FILE *ofp)
{
   int c;
   while ((c = getc(ifp)) != EOF)
      putc(c, ofp);
}

Warning: while using and adapting the above code is fair game in this class, the yylex() function is very different than the filecopy() function! It takes no parameters! It returns an integer every time it finds a token! So if you "borrow" from this example, delete filecopy() and write yylex() from scratch. Multiple students have fallen into this trap before you.
foo.o : foo.c
	gcc -c foo.c

The first line says that to build foo.o you need foo.c, and the second line, which must begin with a tab, gives a command line to execute whenever foo.o should be rebuilt, i.e. when it is missing or when foo.c has been changed and needs to be recompiled.
The first rule in the makefile is what "make" builds by default, but note that make dependencies are recursive: before it checks whether it needs to rebuild foo.o from foo.c it will check whether foo.c needs to be rebuilt using some other rule. Because of this post-order traversal of the "dependency graph", the first rule in your makefile is usually the last one that executes when you type "make". For a C program, the first rule in your makefile would usually be the "link" step that assembles object files into an executable as in:
compiler: foo.o bar.o baz.o
	gcc -o compiler foo.o bar.o baz.o

There is a lot more to "make" but we will take it one step at a time. This article on Make may be useful to you. You can find other useful on-line documentation on "make" (manual page, Internet reference guides, etc) if you look.
lecture #3 began here
category | an integer code used to check syntax |
lexeme | actual string contents of the token |
line, column, file | where the lexeme occurs in source code |
value | for literals, the binary data they represent |
What is the regular expression for each of the different lexical items that appear in C programs? How does this compare with another, possibly simpler programming language such as BASIC?
lexical category | BASIC | C |
---|---|---|
operators | the characters themselves | For operators that are regular expression operators, we need to mark them with double quotes or backslashes to indicate that we mean the character, not the regular expression operator. Note several operators have a common prefix. The lexical analyzer needs to look ahead to tell whether an = is an assignment, or is followed by another = for example. |
reserved words | the concatenation of characters; case insensitive | Reserved words are also matched by the regular expression for identifiers, so a disambiguating rule is needed. |
identifiers | no _; $ at ends of some; 2 significant letters!?; case insensitive | [a-zA-Z_][a-zA-Z0-9]* |
numbers | ints and reals, starting with [0-9]+ | 0x[0-9a-fA-F]+ etc. |
comments | REM.* | C's comments are tricky regexp's |
strings | almost ".*"; no escapes | escaped quotes |
what else? |
The C code generated by lex has the following public interface. Note the use of global variables instead of parameters, and the use of the prefix yy to distinguish scanner names from your program names. This prefix is also used in the YACC parser generator.
FILE *yyin;     /* set this variable prior to calling yylex() */
int yylex();    /* call this function once for each token */
char yytext[];  /* yylex() writes the token's lexeme to an array */
                /* note: with flex, I believe extern declarations must read
                   extern char *yytext; */
int yywrap();   /* called by lex when it hits end-of-file; see below */
The .l file format consists of a mixture of lex syntax and C code fragments. The percent sign (%) is used to signify lex elements. The whole file is divided into three sections separated by %%:
header
%%
body
%%
helper functions
The header consists of C code fragments enclosed in %{ and %} as well as macro definitions consisting of a name and a regular expression denoted by that name. lex macros are invoked explicitly by enclosing the macro name in curly braces. Following are some example lex macros.
letter  [a-zA-Z]
digit   [0-9]
ident   {letter}({letter}|{digit})*
The body consists of a sequence of regular expressions for different token categories and other lexical entities. Each regular expression can have a C code fragment enclosed in curly braces that executes when that regular expression is matched. For most of the regular expressions this code fragment (also called a semantic action) consists of returning an integer that identifies the token category to the rest of the compiler, particularly for use by the parser to check syntax. Some typical regular expressions and semantic actions might include:
" " { /* no-op, discard whitespace */ } {ident} { return IDENTIFIER; } "*" { return ASTERISK; } "." { return PERIOD; }You also need regular expressions for lexical errors such as unterminated character constants, or illegal characters.
The helper functions in a lex file typically compute lexical attributes,
such as the actual integer or string values denoted by literals. One
helper function you have to write is yywrap(), which is called when lex
hits end of file. If you just want lex to quit, have yywrap() return 1.
If your yywrap() switches yyin to a different file and you want lex to continue
processing, have yywrap() return 0. The lex or flex library (-ll or -lfl)
has a default yywrap() function which returns 1, and flex has the directive
%option noyywrap
which allows you to skip writing this function.
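For instance, a yywrap() that continues scanning from additional input files might look roughly like the following sketch; the more_files array and next_file counter are hypothetical globals used only for illustration, not part of lex's interface.

#include <stdio.h>

extern FILE *yyin;

/* hypothetical list of remaining input files, for illustration only */
static char *more_files[] = { "file2.c", "file3.c", NULL };
static int next_file = 0;

int yywrap()
{
   if (more_files[next_file] != NULL) {
      fclose(yyin);
      yyin = fopen(more_files[next_file++], "r");
      if (yyin != NULL)
         return 0;          /* keep scanning, now from the new file */
   }
   return 1;                /* no more input; lex stops */
}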
([0-9]+.[0-9]* | [0-9]*.[0-9]+) ...

You might almost be happier if you wrote
([0-9]*.[0-9]*) { return (strcmp(yytext,".")) ? REAL : PERIOD; }

You-all know C's ternary e1 ? e2 : e3 operator, don't ya? It's an if-then-else expression, very slick.
lecture #4 began here
struct token {
   int category;
   char *text;
   int linenumber;
   int column;
   char *filename;
   union literal value;
};

The union literal will hold computed values of integers, real numbers, and strings. In your homework assignment, I am requiring you to compute column #'s; not all compilers require them, but they are easy. Also: in our compiler project we are not worrying about optimizing our use of memory, so I am not requiring you to use a union.
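For reference, the union literal might be declared along these lines; the exact member names are up to you and are shown here only as a sketch.

union literal {
   int ival;        /* value of an integer literal */
   double dval;     /* value of a real literal */
   char *sval;      /* value of a string literal, with escapes processed */
};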
It turns out the flex man page is intended to be pretty complete, enough so that we can draw our examples from it. Perhaps what you should figure out from these examples is that flex is actually... flexible. The first several examples use flex as a filter from standard input to standard output.
%% "zap me"
%%
[ \t]+        putchar( ' ' );
[ \t]+$       /* ignore this token */
%%
username    printf( "%s", getlogin() );
        int num_lines = 0, num_chars = 0;
%%
\n      ++num_lines; ++num_chars;
.       ++num_chars;
%%
main()
{
   yylex();
   printf( "# of lines = %d, # of chars = %d\n", num_lines, num_chars );
}
/* scanner for a toy Pascal-like language */

%{
/* need this for the call to atof() below */
#include <math.h>
%}

DIGIT    [0-9]
ID       [a-z][a-z0-9]*

%%

{DIGIT}+            { printf( "An integer: %s (%d)\n", yytext, atoi( yytext ) ); }

{DIGIT}+"."{DIGIT}* { printf( "A float: %s (%g)\n", yytext, atof( yytext ) ); }

if|then|begin|end|procedure|function {
                      printf( "A keyword: %s\n", yytext ); }

{ID}                printf( "An identifier: %s\n", yytext );

"+"|"-"|"*"|"/"     printf( "An operator: %s\n", yytext );

"{"[^}\n]*"}"       /* eat up one-line comments */

[ \t\n]+            /* eat up whitespace */

.                   printf( "Unrecognized character: %s\n", yytext );

%%

main( argc, argv )
int argc;
char **argv;
{
   ++argv, --argc;   /* skip over program name */
   if ( argc > 0 )
      yyin = fopen( argv[0], "r" );
   else
      yyin = stdin;

   yylex();
}
COMMENT [/*][[^*/]*[*]*]]*[*/]

One problem here is that square brackets are not parentheses, they do not nest, they do not support concatenation or other regular expression operators. They mean exactly: "match any one of these characters" or for ^: "match any one character that is not one of these characters". Note also that you can't use ^ as a "not" operator outside of square brackets: you can't write the expression for "stuff that isn't */" by saying (^ "*/")
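For what it's worth, a correct regular expression for C comments can be written. One classic flex pattern (shown here as a sketch, not the only way to do it) uses a character class for "any number of non-asterisks" and handles runs of asterisks explicitly:

"/*"([^*]|"*"+[^*/])*"*"+"/"   { /* discard the comment */ }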
lecture #5 began here
Finite Automata

A finite automaton (FA) is an abstract, mathematical machine, also known as a finite state machine, with the following components:
while ((c=getchar()) != EOF) S := move(S, c);
S    = {s0, s1, s2}
E    = {a, b, c}
move = { (s0,a):s1; (s1,b):s2; (s2,c):s2 }
S0   = s0
F    = {s2}
Finite automata correspond in a 1:1 relationship to transition diagrams; from any transition diagram one can write down the formal automaton in terms of items #1-#5 above, and vice versa. To draw the transition diagram for a finite automaton:
state := S0
for(;;)
   switch (state) {
   case 0:
      switch (input) {
      case 'a': state = 1; input = getchar(); break;
      case 'b': input = getchar(); break;
      default: printf("dfa error\n"); exit(1);
      }
      break;
   case 1:
      switch (input) {
      case EOF: printf("accept\n"); exit(0);
      default: printf("dfa error\n"); exit(1);
      }
   }
C Comments:
Fortunately, one can prove that for any NFA, there is an equivalent DFA. They are just a notational convenience. So, finite automata help us get from a set of regular expressions to a computer program that recognizes them efficiently.
multiple transitions on the same symbol handle common prefixes:
factoring may optimize the number of states. Is this picture OK/correct?
void f()
{
   int i = 0;
   struct tokenlist *current, *head;
   ...
   foo(current);
}

Here,
current
is passed in as a parameter to foo, but it is a
pointer that hasn't been pointed at anything. I cannot tell you how many
times I personally have written bugs myself or fixed bugs in student code,
caused by reading or writing to pointers that weren't pointing at anything
in particular. Local variables that weren't initialized point at random
garbage. If you are lucky this is a coredump; if you are not lucky,
you might not find out where the mistake was, and might just get a wrong answer.
This can all be fixed by
struct tokenlist *current = NULL, *head = NULL;
struct token *t = (struct token *)malloc(sizeof(struct token *));

This compiles, but causes coredumps during program execution. Why?
(a|c)*b(a|c)*
(a|c)*|(a|c)*b(a|c)*
(a|c)*(b|ε)(a|c)*
lecture #6 began here
Operations to keep track of sets of NFA states:
NFA to DFA Algorithm:
Dstates := {ε_closure(start_state)}
while T := unmarked_member(Dstates) do {
   mark(T)
   for each input symbol a do {
      U := ε_closure(move(T,a))
      if not member(Dstates, U) then
         insert(Dstates, U)
      Dtran[T,a] := U
   }
}
...
...did you get:
OK, how about this one:
lecture #7 began here
A hash table or other efficient data structure can avoid this duplication. The software engineering design pattern to use is called the "flyweight".
%{
/* #define's for token categories LT, LE, etc. */
%}

white   [ \t\n]+
digit   [0-9]
id      [a-zA-Z_][a-zA-Z_0-9]*
num     {digit}+(\.{digit}+)?

%%

{white} { /* discard */ }
if      { return IF; }
then    { return THEN; }
else    { return ELSE; }
{id}    { yylval.id = install_id(); return ID; }
{num}   { yylval.num = install_num(); return NUMBER; }
"<"     { yylval.op = LT; return RELOP; }
">"     { yylval.op = GT; return RELOP; }

%%

install_id()  { /* insert yytext into the literal table */ }
install_num() { /* insert (binary number corresponding to?) yytext into the literal table */ }

So how would you implement a literal table using a hash table? We will see more hash tables when it comes time to construct the symbol tables with which variable names and scopes are managed, so you had better become fluent.
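One plausible sketch of a literal (string) table built on a chained hash table follows; the bucket count, hash function, and names are illustrative, not required by the assignment. Note how returning the stored copy on a repeat lookup gives the "flyweight" behavior mentioned above.

#include <stdlib.h>
#include <string.h>

#define NBUCKETS 211

struct literal {
   char *text;             /* the lexeme, used as the key */
   struct literal *next;   /* chain of entries that hash to this bucket */
};

static struct literal *bucket[NBUCKETS];

static int hash(char *s)
{
   unsigned h = 0;
   while (*s) h = h * 31 + *s++;
   return h % NBUCKETS;
}

/* return the unique stored copy of s, inserting it on first use */
char *install_string(char *s)
{
   int i = hash(s);
   struct literal *p;
   for (p = bucket[i]; p != NULL; p = p->next)
      if (strcmp(p->text, s) == 0)
         return p->text;             /* already present: reuse it */
   p = malloc(sizeof(struct literal));
   p->text = strdup(s);
   p->next = bucket[i];
   bucket[i] = p;
   return p->text;
}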
lecture #8 began here
let X = the start symbol s
while there is some nonterminal Y in X do
   apply any one production rule using Y, e.g. Y -> w

When X consists only of terminal symbols, it is a string of the language denoted by the grammar. Each iteration of the loop is a derivation step. If an iteration has several nonterminals to choose from at some point, the rules of derivation would allow any of these to be applied. In practice, parsing algorithms tend to always choose the leftmost nonterminal, or the rightmost nonterminal, resulting in strings that are leftmost derivations or rightmost derivations.
lecture #9 began here
E -> E + E
E -> E * E
E -> ( E )
E -> ident

allows two different derivations for strings such as "x + y * z". The grammar is ambiguous, but the semantics of the language dictate a particular operator precedence that should be used. One way to eliminate such ambiguity is to rewrite the grammar. For example, we can force the precedence we want by adding some nonterminals and production rules.
E -> E + T
E -> T
T -> T * F
T -> F
F -> ( E )
F -> ident

Given the arithmetic expression grammar from last lecture:
How can a program figure that x + y * z is legal?
How can a program figure out that x + y (* z) is illegal?
#include <stdlib.h>

includes prototypes for malloc(), free(), etc. malloc() returns a void *.
union lexval *l = (union lexval *)malloc(sizeof(union lexval));

Note the stupid duplication of type information; no language is perfect! Anyhow, always cast your mallocs. The program may work without the cast, but you need to fix every warning, so you don't accidentally let a serious one through.
int F()
{
   int t = yylex();
   if (t == IDENT) return 6;
   else if (t == LP) {
      if (E() && (yylex() == RP)) return 5;
   }
   return 0;
}

Comment #1: if F() is in the middle of a larger parse of E() or T(), F() may succeed, but the subsequent parsing may fail. The parse may have to backtrack, which would mean we'd have to be able to put tokens back for later parsing. Add a memory (say, a gigantic array or linked list for example) of already-parsed tokens to the lexical analyzer, plus backtracking logic to E() or T() as needed. The call to F() may get repeated following a different production rule for a higher nonterminal.
Comment #2: in a real compiler we need more than "yes it parsed" or "no it didn't": we need a parse tree if it succeeds, and we need a useful error message if it didn't.
Question: for E() and T(), how do we know which production rule to try? Option A: just blindly try each one in turn. Option B: look at the first (current) token, and only try those rules that start with that token (one token of lookahead). If you are lucky, that one token will uniquely select a production rule. If that is always true through the whole grammar, no backtracking is needed.
Question: how do we know which rules start with whatever token we are looking at? Can anyone suggest a solution, or are we stuck?
lecture #10 began here
E -> E + T | T
T -> T * F | F
F -> ( E ) | ident

We can remove the left recursion by introducing new nonterminals and new production rules.
E  -> T E'
E' -> + T E' | ε
T  -> F T'
T' -> * F T' | ε
F  -> ( E ) | ident

Getting rid of such immediate left recursion is not enough, one must get rid of indirect left recursion, where two or more nonterminals are mutually left-recursive. One can rewrite any CFG to remove left recursion (Algorithm 4.1).
for i := 1 to n do
   for j := 1 to i-1 do begin
      replace each Ai -> Aj gamma with the productions
         Ai -> delta1 gamma | delta2 gamma | ... | deltak gamma,
      where Aj -> delta1 | delta2 | ... | deltak are the current Aj productions
   end
   eliminate immediate left recursion among the Ai productions
A : A α | β

The recursion must always terminate by A finally deriving β so you can rewrite it to the equivalent
A  : β A'
A' : α A' | ε

Example:
E : E op T | T

can be rewritten
E  : T E'
E' : op T E' | ε
A : A α1 | A α2 | ... | β1 | β2

As in the trivial case, you get rid of left-recursing A and introduce an A'
A  : β1 A' | β2 A' | ...
A' : α1 A' | α2 A' | ... | ε
E : T + E | T - E | T

The problem here is that right recursion is forcing right associativity, but normal arithmetic requires left associativity. Several solutions are: (a) rewrite the grammar to be left recursive, or (b) rewrite the grammar with more nonterminals to force the correct precedence/associativity, or (c) if using YACC or Bison, there are "cheat codes" we will discuss later to allow it to be majorly ambiguous and specify associativity separately (look for %left and %right in YACC manuals).
S -> A B C
A -> a A
A -> ε
B -> b
C -> c

maps to pseudocode like the following. (:= is an assignment operator)
procedure S()
   if A() & B() & C() then succeed   # matched S, we win
end

procedure A()
   if yychar == a then {             # use production 2
      yychar := scan()
      return A()
   }
   else succeed                      # production rule 3, match ε
end

procedure B()
   if yychar == b then {
      yychar := scan()
      succeed
   }
   else fail
end

procedure C()
   if yychar == c then {
      yychar := scan()
      succeed
   }
   else fail
end
S -> cAd
A -> ab
A -> a

Left factoring can often solve such problems:
S  -> cAd
A  -> a A'
A' -> b
A' -> ε

One can also perform left factoring to reduce or eliminate the lookahead or backtracking needed to tell which production rule to use. If the end result has no lookahead or backtracking needed, the resulting CFG can be solved by a "predictive parser" and coded easily in a conventional language. If backtracking is needed, a recursive descent parser takes more work to implement, but is still feasible. As a more concrete example:
S -> if E then S
S -> if E then S1 else S2

can be factored to:
S  -> if E then S S'
S' -> else S2 | ε
For a production X -> Y1 Y2 ... Yk, add First(Y1) to First(X); then

for (i = 1; Yi can derive ε; i++)
   add First(Yi+1) to First(X)
Last time we looked at an example with E, T, and F, and + and *. The first-set computation was not too exciting and we need more examples.
stmt      : if-stmt | OTHER
if-stmt   : IF LP expr RP stmt else-part
else-part : ELSE stmt | ε
expr      : IDENT | INTLIT

What are the First() sets of each nonterminal?
The problem can be summarized as: step through yytext, copying each piece out to sval, removing doublequotes and plusses between the pieces, and evaluating CHR$() constants.
Space allocated with malloc() can be increased in size by realloc(). realloc() is awesome. But, it COPIES and MOVES the old chunk of space you had to the new, resized chunk of space, and frees the old space, so you had better not have any other pointers pointing at that space if you realloc(), and you have to update your pointer to point at the new location realloc() returns.
i = 0; j = 0;
while (yytext[i] != '\0') {
   if (yytext[i] == '\"') {
      /* copy string into sval */
      i++;
      while (yytext[i] != '\"') {
         sval[j++] = yytext[i++];
      }
   }
   else if ((yytext[i] == 'C') || (yytext[i] == 'c')) {
      /* handle CHR$(...) */
      i += 5;
      k = atoi(yytext + i);
      sval[j++] = k;                /* might check for 0-255 */
      while (yytext[i] != ')') i++;
   }
   /* else we can just skip it */
   i++;
}
sval[j] = '\0';   /* NUL-terminate our string */

There is one more problem: how do we allocate memory for sval, and how big should it be?
sval = strdup("");
...
sval = appendstring(sval, yytext[i]);   /* instead of sval[j++] = yytext[i] */

where the function appendstring could be:
char *appendstring(char *s, char c)
{
   int i = strlen(s);
   s = realloc(s, i+2);
   s[i] = c;
   s[i+1] = '\0';
   return s;
}

Note: it is very inefficient to grow your array one character at a time; in real life people grow arrays in large chunks at a time.
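As a sketch of the "grow in chunks" idea, an appendstring() variant might keep a separate capacity and double it only when it runs out of room; the len and cap parameters here are illustrative.

#include <stdlib.h>

/* append c to s, where *len is the current length and *cap the allocated size */
char *appendstring2(char *s, char c, int *len, int *cap)
{
   if (*len + 2 > *cap) {              /* out of room: double the buffer */
      *cap = (*cap < 16) ? 16 : *cap * 2;
      s = realloc(s, *cap);
   }
   s[(*len)++] = c;
   s[*len] = '\0';
   return s;
}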
sval = malloc(strlen(yytext)+1);
/* ... do the code copying into sval; be sure to NUL-terminate */
sval = realloc(sval, strlen(sval)+1);
lecture #11 began here
YACC
YACC ("yet another compiler compiler") is a popular tool which originated at
AT&T Bell Labs. YACC takes a context free grammar as input, and generates a
parser as output. Several independent, compatible implementations (AT&T
yacc, Berkeley yacc, GNU Bison) for C exist, as well as many implementations
for other popular languages.
YACC files end in .y and take the form
declarations
%%
grammar
%%
subroutines

The declarations section defines the terminal symbols (tokens) and nonterminal symbols. The most useful declarations are:
The grammar gives the production rules, interspersed with program code fragments called semantic actions that let the programmer do what's desired when the grammar productions are reduced. They follow the syntax
A : body ;

where body is a sequence of 0 or more terminals, nonterminals, or semantic actions (code, in curly braces) separated by spaces. As a notational convenience, multiple production rules may be grouped together using the vertical bar (|).
Example. For the grammar
(1) S -> aABe
(2) A -> Abc
(3) A -> b
(4) B -> d

the string "abbcde" can be parsed bottom-up by the following reduction steps:
abbcde
aAbcde
aAde
aABe
S
Stack    Input
$        w$

At each step, the parser performs one of the following actions.
You can either declare that struct token may appear in the %union, and put a mixture of struct node and struct token on the value stack, or you can allocate a "leaf" tree node, and point it at your struct token. Or you can use a tree type that allows tokens to include their lexical information directly in the tree nodes. If you have more than one %union type possible, be prepared to see type conflicts and to declare the types of all your nonterminals.
Getting all this straight takes some time; you can plan on it. Your best bet is to draw pictures of how you want the trees to look, and then make the code match the pictures. No pictures == "Dr. J will ask to see your pictures and not be able to help if you can't describe your trees."
Example: in the cocogram.y that I gave you we could add a %union declaration with a union member named treenode:
%union {
   nodeptr treenode;
}

This will produce a compile error if you haven't declared a nodeptr type using a typedef, but that is another story. To declare that a nonterminal uses this union member, write something like:
%type < treenode > function_definition

Terminal symbols use %token to perform the corresponding declaration. If you had a second %union member (say struct token *tokenptr) you might write:
%token < tokenptr > SEMICOL
lecture #12 began here
S -> if E then S
S -> if E then S else S
In many languages two nested "if" statements produce a situation where an "else" clause could legally belong to either "if". The usual rule (to shift) attaches the else to the nearest (i.e. inner) if statement.
Example reduce reduce conflict:
(1) S -> id LP plist RP
(2) S -> E GETS E
(3) plist -> plist, p
(4) plist -> p
(5) p -> id
(6) E -> id LP elist RP
(7) E -> id
(8) elist -> elist, E
(9) elist -> E

By the time the stack holds ...id LP id
T  : F | F T2 ;
T2 : p F T2 | ;
F  : l T r | v ;

The reduce-reduce conflict occurs after you have seen an F. If the next symbol is a p there is no question of what to do, but if the next symbol is the end of file, do you reduce by rule #1 or #4 ?
A slightly different grammar is needed to demonstrate a shift-reduce conflict:
T  : F g;
T  : F T2 g;
T2 : t F T2 ;
T2 : ;
F  : l T r ;
F  : v ;

This grammar is not much different than before, and has the same problem, but the surrounding context (the "calling environments") of F cause the grammar to have a shift-reduce instead of reduce-reduce. Once again, the trouble is after you have seen an F and dwells on the question of whether to reduce the epsilon production, or instead to shift, upon seeing a token g.
The .output file generated by "bison -v" explains these conflicts in
considerable detail. Part of what you need to interpret them are the
concepts of "items" and "sets of items" discussed below.
YACC precedence and associativity declarations
YACC headers can specify precedence and associativity rules for otherwise
heavily ambiguous grammars. Precedence is determined by increasing order
of these declarations. Example:
%right ASSIGN
%left PLUS MINUS
%left TIMES DIVIDE
%right POWER
%%
expr : expr ASSIGN expr
     | expr PLUS expr
     | expr MINUS expr
     | expr TIMES expr
     | expr DIVIDE expr
     | expr POWER expr
     ;
YACC has a built-in error recovery mechanism: a reserved token named error that you can place in your grammar rules at points where errors are expected.
You can easily add information in your own yyerror() function, for example GCC emits messages that look like:
goof.c:1: parse error before '}' token

using a yyerror function that looks like
void yyerror(char *s)
{
   fprintf(stderr, "%s:%d: %s before '%s' token\n",
           yyfilename, yylineno, s, yytext);
}
You could instead use the error recovery mechanism to produce better messages. For example
lbrace : LBRACE
       | { error_code = MISSING_LBRACE; } error
       ;

where LBRACE is the token for an expected "{"
Another related option is to call yyerror() explicitly with a better message string, and tell the parser to recover explicitly:
package_declaration: PACKAGE_TK error { yyerror("Missing name"); yyerrok; } ;
But, using error recovery to perform better error reporting runs against conventional wisdom that you should use error tokens very sparingly. What information from the parser determined we had an error in the first place? Can we use that information to produce a better error message?
We can, in fact, do better without leaning on the error token.
Even just the parse state is enough to do pretty good error messages. yystate is not part of YACC's public interface, though, so you may have to play some tricks to pass it as a parameter into yyerror() from yyparse(). Say, for example:
#define yyerror(s) __yyerror(s,yystate)

Inside __yyerror(msg, yystate) you can use a switch statement or a global array to associate messages with specific parse states. But, figuring out which parse state means which syntax error message would be by trial and error.
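A minimal sketch of such a __yyerror() is shown below; the specific state numbers and messages are made up for illustration and would have to be discovered from your own grammar (e.g. by reading the bison -v .output file and experimenting).

#include <stdio.h>

extern char *yytext;

void __yyerror(char *msg, int state)
{
   switch (state) {
   case 17:                          /* hypothetical state numbers */
      fprintf(stderr, "missing ';' before '%s'\n", yytext); break;
   case 42:
      fprintf(stderr, "expression expected before '%s'\n", yytext); break;
   default:
      fprintf(stderr, "%s before '%s'\n", msg, yytext);
   }
}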
A tool called Merr is available that lets you generate this yyerror function from examples: you supply the sample syntax errors and messages, and Merr figures out which parse state integer goes with which message. Merr also uses the yychar (current input token) to refine the diagnostics in the event that two of your example errors occur on the same parse state. See the Merr web page.
lecture #13 began here
76, 74, 74, 74, 73, 72, 66, 65, 55, 52, 46, 35, 30, 30, 30, 15, 14

1/3rd of the class got an "A". The rest of you need to visit the TA, see how the grades were measured, see the professor, and most important, get a lexical analyzer working well enough to complete the later assignments in this course. If your grade was below 70, you probably want to get it working and resubmit it; I have asked the TA to accept resubmissions and average the grades (example: you got a 30, fixed it and resubmitted it and got a 70; your overall grade is a 50). This option is valid until the due date for the next homework.
The LR parsing algorithm is given below.
ip = first symbol of input
repeat {
   s = state on top of parse stack
   a = *ip
   case action[s,a] of {
      SHIFT s': { push(a); push(s') }
      REDUCE A->beta: {
         pop 2*|beta| symbols
         s' = new state on top
         push A
         push goto(s', A)
      }
      ACCEPT: return 0   /* success */
      ERROR: { error("syntax error", s, a); halt }
   }
}
Definition: An LR(0) item of a grammar G is a production of G with a dot at some position of the RHS.
Example: The production A->aAb gives the items:
A -> . a A b
A -> a . A b
A -> a A . b
A -> a A b .
Note: A production A-> ε generates only one item:
A -> .
Intuition: an item A-> α . β denotes:
Closure: if I is a set of items for a grammar G, then closure(I) is the set of items constructed as follows:
These two rules are applied repeatedly until no new items can be added.
Intuition: If A -> α . B β is in
closure(I) then we hope to see a string derivable from B in the
input. So if B-> γ is a production,
we should hope to see a string derivable from γ.
Hence, B->.γ is in closure(I).
Goto: if I is a set of items and X is a grammar symbol, then goto(I,X) is defined to be:
goto(I,X) = closure({[A->αX.β] | [A->α.Xβ] is in I})
Intuition:
E -> E+T | T
T -> T*F | F
F -> (E) | id

Let I = {[E -> E . + T]} then:
goto(I,+) = closure({[E -> E+.T]})
          = closure({[E -> E+.T], [T -> .T*F], [T -> .F]})
          = closure({[E -> E+.T], [T -> .T*F], [T -> .F], [F -> .(E)], [F -> .id]})
          = { [E -> E+.T], [T -> .T*F], [T -> .F], [F -> .(E)], [F -> .id] }
begin
   C := { closure({[S' -> .S]}) };
   repeat
      for each set of items I in C:
         for each grammar symbol X:
            if goto(I,X) != 0 and goto(I,X) is not in C then
               add goto(I,X) to C;
   until no new sets of items can be added to C;
   return C;
end
Valid Items: an item A -> β1 . β2 is valid for a viable prefix α β1 if there is a derivation:

S' =>*rm α A ω =>rm α β1 β2 ω

Suppose A -> β1 . β2 is valid for α β1, and α β1 is on the parsing stack
Note: two valid items may tell us to do different things for the same viable prefix. Some of these conflicts can be resolved using lookahead on the input string.
Example:
S -> aABe     FIRST(S)  = {a}    FOLLOW(S)  = {$}
A -> Abc      FIRST(A)  = {b}    FOLLOW(A)  = {b,d}
A -> b        FIRST(B)  = {d}    FOLLOW(B)  = {e}
B -> d        FIRST(S') = {a}    FOLLOW(S') = {$}

I0 = closure([S'->.S]) = closure([S'->.S],[S->.aABe])
goto(I0,S) = closure([S'->S.]) = I1
goto(I0,a) = closure([S->a.ABe]) = closure([S->a.ABe],[A->.Abc],[A->.b]) = I2
goto(I2,A) = closure([S->aA.Be],[A->A.bc]) = closure([S->aA.Be],[A->A.bc],[B->.d]) = I3
goto(I2,b) = closure([A->b.]) = I4
goto(I3,B) = closure([S->aAB.e]) = I5
goto(I3,b) = closure([A->Ab.c]) = I6
goto(I3,d) = closure([B->d.]) = I7
goto(I5,e) = closure([S->aABe.]) = I8
goto(I6,c) = closure([A->Abc.]) = I9
lecture #14 began here
Parse trees are k-ary, where there is a variable number of children bounded by a value k determined by the grammar. You may wish to consult your old data structures book, or look at some books from the library, to learn more about trees if you are not totally comfortable with them.
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

struct tree {
   short label;            /* what production rule this came from */
   short nkids;            /* how many children it really has */
   struct tree *child[1];  /* array of children, size varies 0..k */
};

struct tree *alctree(int label, int nkids, ...)
{
   int i;
   va_list ap;
   struct tree *ptr = malloc(sizeof(struct tree) +
                             (nkids-1) * sizeof(struct tree *));
   if (ptr == NULL) { fprintf(stderr, "alctree out of memory\n"); exit(1); }
   ptr->label = label;
   ptr->nkids = nkids;
   va_start(ap, nkids);
   for (i=0; i < nkids; i++)
      ptr->child[i] = va_arg(ap, struct tree *);
   va_end(ap);
   return ptr;
}
Besides a function to allocate trees, you need to write one or more recursive functions to visit each node in the tree, either top to bottom (preorder), or bottom to top (postorder). You might do many different traversals on the tree in order to write a whole compiler: check types, generate machine-independent intermediate code, analyze the code to make it shorter, etc. You can write 4 or more different traversal functions, or you can write 1 traversal function that does different work at each node, determined by passing in a function pointer, to be called for each node.
void postorder(struct tree *t, void (*f)(struct tree *))
{
   /* postorder means visit each child, then do work at the parent */
   int i;
   if (t == NULL) return;

   /* visit each child */
   for (i=0; i < t->nkids; i++)
      postorder(t->child[i], f);

   /* do work at parent */
   f(t);
}

You would then be free to write as many little helper functions as you want, for different tree traversals, for example:
void printer(struct tree *t)
{
   if (t == NULL) return;
   printf("%p: %d, %d children\n", t, t->label, t->nkids);
}
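A call such as the following (assuming root points at the top of your syntax tree) would then print one line for every node in the tree:

postorder(root, printer);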
What we have at the start of semantic analysis is a syntax tree that corresponds to the source program as parsed using the context free grammar. Semantic information is added by annotating grammar symbols with semantic attributes, which are defined by semantic rules. A semantic rule is a specification of how to calculate a semantic attribute that is to be added to the parse tree.
So the input is a syntax tree...and the output is the same tree, only "fatter" in the sense that nodes carry more information. Another output of semantic analysis is error messages detecting many types of semantic errors.
Two typical examples of semantic analysis include:
Notations used in semantic analysis:
In practice, attributes get stored in parse tree nodes, and the semantic rules are evaluated either (a) during parsing (for easy rules) or (b) during one or more (sub)tree traversals.
CFG | Semantic Rule |
---|---|
E1 : E2 + T | E1.isconst = E2.isconst && T.isconst; if (E1.isconst) E1.value = E2.value + T.value |
E : T | E.isconst = T.isconst; if (E.isconst) E.value = T.value |
T1 : T2 * F | T1.isconst = T2.isconst && F.isconst; if (T1.isconst) T1.value = T2.value * F.value |
T : F | T.isconst = F.isconst; if (T.isconst) T.value = F.value |
F : ( E ) | F.isconst = E.isconst; if (F.isconst) F.value = E.value |
F : ident | F.isconst = FALSE |
F : intlit | F.isconst = TRUE; F.value = intlit.ival |
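These rules can be evaluated with a small postorder traversal. The sketch below is illustrative only: it invents a node structure with isconst/value attribute fields and made-up production-rule codes (PLUS_RULE, INTLIT_RULE); your own tree type and labels will differ.

#include <stddef.h>

struct enode {
   int label;              /* which production rule built this node */
   int nkids;
   struct enode *child[9];
   int isconst;            /* semantic attribute: constant expression? */
   int value;              /* semantic attribute: its value, if isconst */
};

enum { PLUS_RULE = 1, INTLIT_RULE = 2 };   /* illustrative rule codes */

void fold(struct enode *t)
{
   int i;
   if (t == NULL) return;
   for (i = 0; i < t->nkids; i++)
      fold(t->child[i]);              /* children first: synthesized attributes */

   switch (t->label) {
   case PLUS_RULE:                    /* E1 : E2 + T, '+' token kept as child[1] */
      t->isconst = t->child[0]->isconst && t->child[2]->isconst;
      if (t->isconst)
         t->value = t->child[0]->value + t->child[2]->value;
      break;
   case INTLIT_RULE:                  /* F : intlit; value was filled in from the token */
      t->isconst = 1;
      break;
   default:
      t->isconst = 0;                 /* F : ident; rules like E : T or F : ( E )
                                         would instead copy the child's attributes */
   }
}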
lecture #15 began here
Options:
Perhaps the best approach to all this is to unify the tokens and parse tree nodes with something like the following, where perhaps an nkids value of -1 is treated as a flag that tells the reader to use lexical information instead of pointers to children:
struct node {
   int code;      /* terminal or nonterminal symbol */
   int nkids;
   union {
      struct token { ... } leaf;
      struct node *kids[9];
   } u;
};

There are actually nonterminal symbols with 0 children (a nonterminal with a righthand side with 0 symbols) so you don't necessarily want to use an nkids of 0 as your flag to say that you are a leaf.
lecture #16 began here
(d+pd*|d*pd+)(ed+)?
E : E + T | T
T : T * F | F
F : ( E ) | ident
A: questions that allow you to demonstrate that you know the difference between a DFA and an NFA, questions about lex and flex and tokens and lexical attributes, questions about context free grammars: ambiguity, factoring, removing left recursion, etc.
typedef int foo;
foo x;                   /* a normal use of typedef... */
foo foo;                 /* try this on gcc! is it a legal global? */
void main() { foo foo; } /* what about this ? */
370-C does not support typedef's and without working typedef's the TYPE_NAME token simply will never occur. Typedef's are fair game for extra credit points.
struct c_type {
   int base_type;   /* 1 = int, 2=float, ... */
   union {
      struct array {
         int size;
         struct c_type *elemtype;
      } a;
      struct c_type *p;
      struct struc {
         char *label;
         struct field **f;
      } s;
   } u;
};

struct field {
   char *name;
   struct c_type *elemtype;
};

Given this representation, how would you initialize a variable to represent each of the following types:
int [10][20]
struct foo { int x; char *s; }
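For the first type, one possible answer is sketched below; it assumes the struct c_type declaration above, and the helper names and the base_type code used for "array" are made up for illustration.

#include <stdlib.h>

struct c_type *alc_int(void)
{
   struct c_type *t = calloc(1, sizeof(struct c_type));
   t->base_type = 1;                       /* int */
   return t;
}

struct c_type *alc_array(int size, struct c_type *elem)
{
   struct c_type *t = calloc(1, sizeof(struct c_type));
   t->base_type = 3;                       /* hypothetical code for "array" */
   t->u.a.size = size;
   t->u.a.elemtype = elem;
   return t;
}

/* int [10][20] is an array of 10 elements, each an array of 20 ints */
struct c_type *int_10_20;
void build(void) { int_10_20 = alc_array(10, alc_array(20, alc_int())); }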
grammar rule | semantic rule |
---|---|
E1 : E2 PLUS E3 | E1.type = check_types(PLUS, E2.type, E3.type) |
int x; long y; y = y + x;
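A check_types() for the rule above might look roughly like the following sketch; it assumes, for simplicity, that types are small integer codes rather than the struct c_type shown earlier, and the typeerror() helper is hypothetical. The int/long mix in the example above would fall into the widening case.

enum { T_INT = 1, T_LONG = 2, T_FLOAT = 3, T_ERROR = -1 };   /* illustrative codes */

extern void typeerror(char *msg);      /* hypothetical error reporter */

int check_types(int op, int left, int right)
{
   (void)op;   /* op would matter for, e.g., relational vs. arithmetic operators */
   if (left == T_ERROR || right == T_ERROR)
      return T_ERROR;                  /* don't cascade error messages */
   if (left == right)
      return left;                     /* same type: result is that type */
   if ((left == T_INT && right == T_LONG) || (left == T_LONG && right == T_INT))
      return T_LONG;                   /* widen int to long, as in y = y + x */
   typeerror("incompatible operands");
   return T_ERROR;
}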
For records/structures, some languages use name equivalence, while others use structure equivalence. Features like typedef complicate matters. If you have a new type name MY_INT that is defined to be an int, is it compatible to pass as a parameter to a function that expects regular int's? Object-oriented languages also get interesting during type checking, since subclasses usually are allowed anyplace their superclass would be allowed.
lecture #17 began here
lecture #18 began here
Scope rules for each language determine how to go from names to declarations.
Each use of a variable name must be associated with a declaration. This is generally done via a symbol table. In most compiled languages it happens at compile time (in contrast, for example, with LISP).
code |
---|
static data |
stack (grows down) |
heap (may grow up, from bottom of address space) |
 | return value |
 | parameter |
 | ... |
 | parameter |
 | previous frame pointer (FP) |
 | saved registers |
 | ... |
FP--> | saved PC |
 | local |
 | ... |
 | local |
 | temporaries |
SP--> | ... |
The Basic problem in garbage collection: given a piece of memory, are there any pointers to it? (And if so, where exactly are all of them please). Approaches:
To work on segmentation faults: recompile all .c files with -g and run your program inside gdb to the point of the segmentation fault. Type the gdb "where" command. Print the values of variables on the line mentioned in the debugger as the point of failure. If it is inside a C library function, use the "up" command until you are back in your own code, and then print the values of all variables mentioned on that line.
There is one more tool you should know about, which is useful for certain kinds of bugs, primarily subtle memory violations. It is called electric fence. To use electric fence you add
/home/uni1/jeffery/ef/ElectricFence-2.1/libefence.a

to the line in your makefile that links your object files together to form an executable.
lecture #19 began here
Can be formulated as syntax-directed translation
newtemp()
newlabel()
Production | Semantic Rules |
---|---|
S -> id ASN E | S.code = E.code || gen(ASN, id.place, E.place) |
E -> E1 PLUS E2 | E.place = newtemp(); E.code = E1.code || E2.code || gen(PLUS,E.place,E1.place,E2.place); |
E -> E1 MUL E2 | E.place = newtemp(); E.code = E1.code || E2.code || gen(MUL,E.place,E1.place,E2.place); |
E -> MINUS E1 | E.place = newtemp(); E.code = E1.code || gen(NEG,E.place,E1.place); |
E -> LP E1 RP | E.place = E1.place; E.code = E1.code; |
E -> IDENT | E.place = id.place; E.code = emptylist(); |
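To make the table above concrete, here is a rough sketch (a set of assumptions, not the required design) of what the building blocks might look like in C: an instruction node, newtemp(), gen(), and list concatenation for the || operator. Addresses are simplified to plain ints here.

#include <stdlib.h>

/* a sketch of three-address instructions kept as a linked list */
struct instr {
   int op;                    /* ASN, PLUS, MUL, ... */
   int dest, src1, src2;      /* addresses, simplified to ints for this sketch */
   struct instr *next;
};

static int next_temp = 0;

int newtemp(void) { return next_temp++; }    /* just hand out temporary numbers */

struct instr *gen(int op, int dest, int src1, int src2)
{
   struct instr *p = calloc(1, sizeof(struct instr));
   p->op = op; p->dest = dest; p->src1 = src1; p->src2 = src2;
   return p;                                 /* a code list of length one */
}

/* destructively append list b after list a; either may be empty (NULL) */
struct instr *concat(struct instr *a, struct instr *b)
{
   struct instr *p = a;
   if (a == NULL) return b;
   while (p->next != NULL) p = p->next;
   p->next = b;
   return a;
}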
Instruction set:
mnemonic | C equivalent | description |
---|---|---|
ADD, SUB,MUL,DIV | x := y op z | store result of binary operation on y and z to x |
NEG | x := op y | store result of unary operation on y to x |
ASN | x := y | store y to x |
ADDR | x := &y | store address of y to x |
LCONT | x := *y | store contents pointed to by y to x |
SCONT | *x := y | store y to location pointed to by x |
GOTO | goto L | unconditional jump to L |
BLESS,... | if x rop y then goto L | binary conditional jump to L |
BIF | if x then goto L | unary conditional jump to L |
BNIF | if !x then goto L | unary negative conditional jump to L |
PARM | param x | store x as a parameter |
CALL | call p,n,x | call procedure p with n parameters, store result in x |
RET | return x | return from procedure, use x as the result |
Declarations (Pseudo instructions): These declarations list size units as "bytes"; in a uniform-size environment offsets and counts could be given in units of "slots", where a slot (4 bytes on 32-bit machines) holds anything.
global x,n1,n2 | declare a global named x at offset n1 having n2 bytes of space |
---|---|
proc x,n1,n2 | declare a procedure named x with n1 bytes of parameter space and n2 bytes of local variable space |
local x,n | declare a local named x at offset n from the procedure frame |
label Ln | designate that label Ln refers to the next instruction |
end | declare the end of the current procedure |
x := y field z | lookup field named z within y, store address to x |
---|---|
class x,n1,n2 | declare a class named x with n1 bytes of class variables and n2 bytes of class method pointers |
field x,n | declare a field named x at offset n in the class frame |
new x | create a new instance of class name x |
You do this sizing up once for each scope. The size of each scope is the sum of the sizes of symbols in its symbol table.
struct descrip {
   short type;
   short size;
   union {
      char *string;
      int ival;
      float rval;
      struct descrip *array;
      /* ... for other types */
   } value;
};
What are the basic blocks in the following 3-address code? ("read" is a 3-address code to read in an integer.)
read x
t1 = x > 0
if t1 == 0 goto L1
fact = 1
label L2
t2 = fact * x
fact = t2
t3 = x - 1
x = t3
t4 = x == 0
if t4 == 0 goto L2
t5 = addr const:0
param t5        ; "%d\n"
param fact
call p,2
label L1
halt

Basic blocks are often used in order to talk about specific types of optimizations that rely on basic blocks. So if they are used for optimization, why did I introduce basic blocks? You can view every basic block as a hamburger; it will be a lot easier to eat if you sandwich it inside a pair of labels (first and follow)!
Depending on your source language's semantic rules for things like "short-circuit" evaluation for boolean operators, the operators like || and && might be similar to + and * (non-short-circuit) or they might be more like if-then code.
A general technique for implementing control flow code is to add new attributes to tree nodes to hold labels that denote the possible targets of jumps. The labels in question are sort of analogous to FIRST and FOLLOW; for any given list of instructions corresponding to a given tree node, we might want a .first attribute to hold the label for the beginning of the list, and a .follow attribute to hold the label for the next instruction that comes after the list of instructions. The .first attribute can be easily synthesized. The .follow attribute must be inherited from a sibling. The labels have to actually be allocated and attached to instructions at appropriate nodes in the tree corresponding to grammar production rules that govern control flow. An instruction in the middle of a basic block needs neither a first nor a follow.
C code | Attribute Manipulations |
---|---|
S->if E then S1 | E.true = newlabel(); E.false = S.follow; S1.follow = S.follow; S.code = E.code || gen(LABEL, E.true)|| S1.code |
S->if E then S1 else S2 | E.true = newlabel(); E.false = newlabel(); S1.follow = S.follow; S2.follow = S.follow; S.code = E.code || gen(LABEL, E.true)|| S1.code || gen(GOTO, S.follow) || gen(LABEL, E.false) || S2.code |
lecture #20 began here
LANL is seeking outstanding SOPHOMORE, JUNIOR AND NON- GRADUATING SENIOR LEVEL Computer Science majors to work in the areas of networking, desktop support, high performance computing or software engineering. Positions are available for the fall 2006 semester. MUST HAVE A GPA OF 3.0 OR HIGHER.
To request a referral go to www.nmsu.edu/pment, click on "Co- op Job Listings", Job #86 or call the co-op office at 646- 4115. LANL is requiring a cover letter to also be sent, please send that via email at coop@nmsu.edu in the subject line put attn: LANL cover letter.
Co-op Office 646-4115
Implementation techniques for these alternatives include:
a<b || c<d && e<f

translates into
100: if a<b goto 103
     t1 = 0
     goto 104
103: t1 = 1
104: if c<d goto 107
     t2 = 0
     goto 108
107: t2 = 1
108: if e<f goto 111
     t3 = 0
     goto 112
111: t3 = 1
112: t4 = t2 AND t3
     t5 = t1 OR t4
a<b || c<d && e<f

translates into
    if a<b goto L1
    if c<d goto L2
    goto L3
L2: if e<f goto L1
L3: t = 0
    goto L4
L1: t = 1
L4: ...

Note: L3 might instead be the target E.false; L1 might instead be E.true; no computation of a 0 or 1 into t might be needed at all.
C code | Attribute Manipulations |
---|---|
S->while E do S1 | E.true = newlabel(); E.false = S.follow; S1.follow = E.first; S.code = gen(LABEL, E.first) || E.code || gen(LABEL, E.true)|| S1.code || gen(GOTO, E.first) |
lecture #21 began here
void main()
{
   int i;
   i = 0;
   while (i < 20)
      i = i * i + 1;
   print(i);
}

This code has the following syntax tree
i = 0;
if (i >= 20) goto L50;
i = i * i + 1;
goto 20;
print(i);

This program corresponds to the following syntax tree, which a successful homework #5 would build. Note that it has a height of approximately 10, and a maximum arity of approximately 4. Also: your exact tree might have more nodes, or slightly fewer; as long as the information and general shape is there, such variations are not a problem.
A syntax tree, with attributes obtained from lexical and semantic analysis, needs to be shown here. During semantic analysis, it is discovered that "print" has not been defined, so let it be:
void print(int i) { }
The code for the boolean conditional expression controlling the while loop is a list of length 1, containing the instruction t0 = i < 20, or more formally
opcode | dest | src1 | src2 |
---|---|---|---|
LT | t0 | i | 20 |
The actual C representation of addresses dest, src1, and src2 is probably as a pair:

region | offset |
opcode | dest | src1 | src2 |
---|---|---|---|
LT | local t0.offset | local i.offset | const 20 |
Regions are expressed with a simple integer encoding like: global=1, local=2, const=3. Note that address values in all regions are offsets from the start of the region, except for region "const", which stores the actual value of a single integer as its offset.
opcode | dest | src1 | src2 |
---|---|---|---|
MUL | local t1.offset | local i.offset | local i.offset |
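For example, the address pairs and a three-address instruction might be declared along the lines of this sketch; the names and exact region codes are illustrative, not required.

/* a sketch of the representation discussed above */
#define R_GLOBAL 1
#define R_LOCAL  2
#define R_CONST  3

struct address {
   int region;     /* R_GLOBAL, R_LOCAL, or R_CONST */
   int offset;     /* byte offset, or the literal value itself if region is R_CONST */
};

struct tac {       /* one three-address instruction, e.g. LT t0, i, 20 */
   int opcode;
   struct address dest, src1, src2;
};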
lecture #22 began here
void codegen(nodeptr t)
{
   int i;
   if (t == NULL) return;
   /*
    * this is a post-order traversal, so visit children first
    */
   for (i=0; i < t->nkids; i++)
      codegen(t->child[i]);
   /*
    * back from children, consider what we have to do with
    * this node. The main thing we have to do, one way or
    * another, is assign t->code
    */
   switch (t->label) {
   case PLUS: {
      t->code = concat(t->child[0]->code, t->child[1]->code);
      t->code = concat(t->code,
                       gen(PLUS, t->address,
                           t->child[0]->address, t->child[1]->address));
      break;
      }
   /* ... really, we need a bazillion cases, perhaps one for each
    * production rule (in the worst case) */
   default:
      /* default is: concatenate our children's code */
      t->code = NULL;
      for (i=0; i < t->nkids; i++)
         t->code = concat(t->code, t->child[i]->code);
   }
}
Zero operators.
if (x) S

translates into
if x != 0 goto L1
goto L2
label L1
...code for S
label L2

or if you are being fancy
if x == 0 goto L1
...code for S
label L1

I may do this without comment in later examples, to keep them short.
One relational operator.
if (a < b) S

translates into
if a >= b goto L1
...code for S
label L1

One boolean operator.
if (a < b && c > d) S

translates into
if (a < b)
   if (c > d)
      ...code for S

which if we expand it
if a >= b goto L1
if c <= d goto L2
...code for S
label L2
label L1

by mechanical means, we may wind up with lots of labels for the same target; this is OK.
if (a < b || c > d) S

translates into
if (a < b)
   ...code for S
if (c > d)
   ...code for S

but it's unacceptable to duplicate the code for S! It might be huge! Generate labels for boolean-true-yes-we-do-this-thing, not just for boolean-false-we-skip-this-thing.
if a < b goto L1
if c > d goto L2
goto L3
label L2
label L1
...code for S
label L3
Now, what about arrays? Reading an array value: x = a[i]. Draw the picture. Consider that the machine uses byte-addressing, not word-addressing.
t0 := addr a
t1 := i * 4
t2 := plus t0 t1
t3 := deref t2
x  := t3

What about writing an array value?
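One plausible answer for a[i] = x, mirroring the read case but ending with a store through the computed pointer (the SCONT-style instruction from the table above):

t0 := addr a
t1 := i * 4
t2 := plus t0 t1
*t2 := x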
lecture #23 began here
Remind me to come back to HW #6 before the end of
today's lecture.
Final Code
Goal: execute the program we have been translating, somehow.
Alternatives:
In the Old Days, there were Load-Store hardware architectures in which only one (accumulator) register was present. On such an architecture, register allocation and assignment is not needed; the compiler has few options about how it uses the accumulator register. Traditional x86 16-bit architecture was only a little better than a load-store architecture, with 4 registers instead of 1. At the other extreme, Recent History has included CPU's with 32 or more general purpose registers. On such systems, high quality compiler register allocation and assignment makes a huge difference in program execution speed. Unfortunately, optimal register allocation and assignment is NP-complete, so compilers must settle for doing a "good" job.
AssignStmt : Var EQU Expr

We might extend the C semantic action for that rule with extra code after building our parse tree node:
AssignStmt : Var EQU Expr
   { $$ = alctree(..., $1, $2, $3); lvalue($1); rvalue($3); }

lvalue() and rvalue() are mini-tree traversals for the lefthand side and righthand side of an assignment statement. Their missions are to propagate information from the parent, namely, inherited attributes that tell nodes whether their values are being assigned to (initialized) or being read from.
void lvalue(struct tree *t)
{
   int i;
   if (t->label == IDENT) {
      struct symtabentry *ste = lookup(t->u.token.name);
      ste->lvalue = 1;
   }
   for (i=0; i < t->nkids; i++) {
      lvalue(t->child[i]);
   }
}

void rvalue(struct tree *t)
{
   int i;
   if (t->label == IDENT) {
      struct symtabentry *ste = lookup(t->u.token.name);
      if (ste->lvalue == 0)
         warn("possible use before assignment");
   }
   for (i=0; i < t->nkids; i++) {
      rvalue(t->child[i]);
   }
}
lecture #24 began here
So you need a runtime system; potentially, this might be as big or bigger a job than writing the compiler. Languages vary from assembler (no runtime system) and C (small runtime system, mostly C with some assembler) on up to Java (large runtime system, mostly Java with some C) and in even higher level languages the compiler may evaporate and the runtime system become gigantic. The Unicon language has a relatively trivial compiler and gigantic virtual machine and runtime system. Other scripting languages might have no compiler at all, doing everything (even lexing and parsing) in the runtime system.
For your project: whether you generate C or X86 or Java, you'll need a plan for what to do about a runtime system. And, in principle, I am not opposed to helping with this part. But the compiler and runtime system have to fit together; if I write part of the BASIC runtime system for you, or we write it together, we have to agree on things such as: what the types of parameters and return values must look like.
So, what belongs in a Color BASIC runtime system? Anything not covered by a three address instruction. Looking at cocogram.y:
What would a runtime system function look like? It would take in and pass out BASIC values, represented as C structs. You would then link this code in to your generated C or assembler code (if you generated Java code, you would have to deal with the Java Native Interface or else write these functions in Java).
void PRINT(struct descrip *d)
{
   switch (d->type) {
   case INTEGER: printf("%d", d->value.ival); break;
   case REAL:    printf("%f", d->value.rval); break;
   case STRING:  printf("%*s", d->size, d->value.string); break;
   case ARRAY:   printf("cannot print arrays"); break;
   /* can't get here */
   default: printf("PRINT: internal error, type %d\n", d->type);
   }
}

Now, let's look at the "whole" runtime system:
What do variables A, A(), A$, and A$() look like in memory? How does our runtime system make it so?
Let's take a look at DIM, in libc.c. This DIM is for arrays of numbers. How would you handle arrays of strings?
Can you implement STRCAT for your BASIC runtime system? (A sketch follows these questions.)
What other BASIC statements, operators, or functions allocate memory?
How would we avoid memory "leaks"?
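Returning to the STRCAT question, a minimal sketch (assuming the struct descrip shown earlier, a STRING type code, and no garbage collection or error handling) might look like this:

#include <stdlib.h>
#include <string.h>

struct descrip *STRCAT(struct descrip *a, struct descrip *b)
{
   struct descrip *d = malloc(sizeof(struct descrip));
   d->type = STRING;                       /* assumed type code */
   d->size = a->size + b->size;
   d->value.string = malloc(d->size + 1);
   memcpy(d->value.string, a->value.string, a->size);
   memcpy(d->value.string + a->size, b->value.string, b->size);
   d->value.string[d->size] = '\0';        /* NUL-terminate for C's benefit */
   return d;
}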
If you use the "call" (3-address) instruction to do GOSUB, your native code will have to make a clear distinction between BASIC call's and calls to runtime system (built-in) functions. Perhaps it is best to implement BASIC GOSUB by pushing a "param" (the next instruction following the GOSUB) and a "goto". The BASIC RETURN is then a "pop" followed by a "goto". What, we don't have a "pop" 3-address instruction? We do now... the name of "param" should probably be "push" anyhow.
Come to think of it, we've been talking about doing a call to a built-in function such as PRINT, but that PRINT function we wrote is C code; it doesn't do a 3-address "ret" instruction, hmmm. How are we going to generate the native code for the 3-address "call" instruction? It may include an assembler call instruction, but it may also involve instructions to handle the interface between BASIC and C.
lecture #25 began here
Even if an instruction set does support memory-based operations, most compilers will want to load a value into a register while it is being used, and then spill it back out to main memory when the register is needed for another purpose. The task of minimizing memory accesses becomes the task of minimizing register loads and spills.
a = a+b+c+d+e+f+g+a+c+e;

Our naive three address code generator would generate a lot of temporary variables here, when really one big sum is being computed. How many registers does the expression need? Some variables are referenced once, some twice. GCC generates:
movl b, %eax
addl a, %eax
addl c, %eax
addl d, %eax
addl e, %eax
addl f, %eax
addl g, %eax
addl a, %eax
addl c, %eax
addl e, %eax
movl %eax, a

Now consider
a = (a+b)*(c+d)*(e+f)*(g+a)*(c+e);

How many registers are needed here?
movl b, %eax
movl a, %edx
addl %eax, %edx
movl d, %eax
addl c, %eax
imull %eax, %edx
movl f, %eax
addl e, %eax
imull %eax, %edx
movl a, %eax
addl g, %eax
imull %eax, %edx
movl e, %eax
addl c, %eax
imull %edx, %eax
movl %eax, a

And now this:
a = ((a+b)*(c+d))+((e+f)*(g+a))+(c*e);

which compiles to
movl b, %eax
movl a, %edx
addl %eax, %edx
movl d, %eax
addl c, %eax
movl %edx, %ecx
imull %eax, %ecx
movl f, %eax
movl e, %edx
addl %eax, %edx
movl a, %eax
addl g, %eax
imull %edx, %eax
leal (%eax,%ecx), %edx
movl c, %eax
imull e, %eax
leal (%eax,%edx), %eax
movl %eax, a

Lastly (for now) consider:
a = ((a+b)*(c+d))+(((e+f)*(g+a))/(c*e));

The division instruction adds new wrinkles. It operates on an implicit register accumulator which is twice as many bits as the number you divide by, meaning 64 bits (two registers) to divide by a 32-bit number. Note in this code that gcc would rather spill than use %ebx. %ebx is either being used implicitly or is reserved by the compiler for some (probably good) reason. %edi and %esi are similarly ignored.
movl b, %eax
movl a, %edx
addl %eax, %edx
movl d, %eax
addl c, %eax
movl %edx, %ecx
imull %eax, %ecx
movl f, %eax
movl e, %edx
addl %eax, %edx
movl a, %eax
addl g, %eax
imull %eax, %edx
movl c, %eax
imull e, %eax
movl %eax, -4(%ebp)
movl %edx, %eax
cltd
idivl -4(%ebp)
movl %eax, -4(%ebp)
movl -4(%ebp), %edx
leal (%edx,%ecx), %eax
movl %eax, a
iload_1
iload_2
iadd
iload_3
iload 4
iadd
imul
iload 5
iload 6
iadd
iload 7
iload_1
iadd
imul
iload_3
iload 5
imul
idiv
iadd
istore_1
lecture #26 began here
name | sample | optimized as |
---|---|---|
redundant load or store | MOVE R0,a ; MOVE a,R0 | MOVE R0,a |
dead code | #define debug 0 ... if (debug) printf("ugh"); | |
control flow simplification | if a < b goto L1 ... L1: goto L2 | if a < b goto L2 ... L1: goto L2 |
algebraic simplification | x = x * 1; | |
strength reduction | x = y * 16; | x = y << 4; |
x = 7; ... y = x+5;
Explicit:
(a+b)*i + (a+b)/j;

The (a+b) is a common subexpression that you should not have to compute twice.
Implicit:
x = a[i];
a[i] = a[j];
a[j] = x;

Every array subscript requires an addition operation to compute the memory address; but do we have to compute the location for a[i] and a[j] twice in this code?
For example, the loop

for(i=0; i<3; i++) { x += i * i; y += x * x; }

can be unrolled to

x += 0 * 0;  y += x * x;
x += 1 * 1;  y += x * x;
x += 2 * 2;  y += x * x;

which constant folding then reduces to

y += x * x;
x += 1;  y += x * x;
x += 4;  y += x * x;
Similarly, a loop-invariant computation can be moved out of the loop:

for (i=0; i<strlen(s); i++) s[i] = tolower(s[i]);

becomes

t_0 = strlen(s);
for (i=0; i<t_0; i++) s[i] = tolower(s[i]);
And a call with a constant argument can be replaced by a call to a specialized version of the procedure:

f(x,r,s,1);

int f(int x, float y, char *z, int n)
{
   switch (n) {
   case 1: do_A; break;
   case 2: do_B; break;
   ...
   }
}

becomes

f_1(x,r,s);

int f_1(int x, float y, char *z) { do_A; }
int f_2(int x, float y, char *z) { do_B; }
...