code generation

This document is a guide to generating target code from intermediate three-address instructions. Its contents are simple and unfinished; the methodology for creating it is perhaps more important. The guide was produced by reverse engineering, that is to say, by examining the output of "gcc -S".

intermediate code instruction
x86_64 equivalent
Comment
int x;
(global)
.comm x,4,4
The operands of .comm are name, size, alignment.
x := y + z
(C global variables)
movl y(%rip), %edx
movl z(%rip), %eax
leal (%rdx,%rax), %eax
movl %eax, x(%rip)
The register %rip, which is not mentioned in Bryant/O'Hallaron Figure 2, is the instruction pointer (a.k.a. the program counter) in its 64-bit edition.
x := y + z
(local variables)
movl -4(%rbp), %eax
movl -8(%rbp), %edx
leal (%rdx,%rax), %eax
movl %eax, -12(%rbp)
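Once every local has a frame offset, the four-instruction pattern above can be produced mechanically. Here is a minimal sketch in C. The helper name emit_add_locals is hypothetical, and the assignment of y to -4(%rbp), z to -8(%rbp) and x to -12(%rbp) is an assumption read off the listing, not something the listing states.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical emitter for "x := y + z" on 32-bit locals.
 * Each *_off argument is the variable's offset from %rbp, assumed to be
 * known from the symbol table. Writes the assembly text into buf and
 * returns what snprintf returns (the length the full text needs).
 */
int emit_add_locals(char *buf, size_t size, int x_off, int y_off, int z_off)
{
    return snprintf(buf, size,
                    "movl %d(%%rbp), %%eax\n"       /* eax = y */
                    "movl %d(%%rbp), %%edx\n"       /* edx = z */
                    "leal (%%rdx,%%rax), %%eax\n"   /* eax = y + z */
                    "movl %%eax, %d(%%rbp)\n",      /* x = eax */
                    y_off, z_off, x_off);
}
```

Calling emit_add_locals(buf, sizeof buf, -12, -4, -8) reproduces the listing above; a real generator would get the offsets from its symbol table rather than hard-coding them.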
x := y + z
(class foo variables)
movq %rdi, -8(%rbp) # t1 = self
movq -8(%rbp), %rax # rax = self
movq 8(%rax), %rdx # rdx = self->y
movq -8(%rbp), %rax # rax = self
movq 16(%rax), %rax # rax = self->z
addq %rax, %rdx # rdx = y+z
movq -8(%rbp), %rax # rax = self
movq %rdx, (%rax) # self->x = rdx

optimizes (-O2) to

movq 16(%rdi), %rax
addq 8(%rdi), %rax
movq %rax, (%rdi)

Note the main issue: the memory layout of fields x, y, z at offsets 0, 8, 16. These offsets are known at compile time for static/non-virtual OOP. A dynamic/virtual language would treat the statement as
self.x = self.y + self.z
and implement each field operation via a runtime call or table lookup.
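To make the static case concrete, here is a small C sketch of the layout. The struct definition is an assumption: three 64-bit fields are chosen only because they match the movq accesses at offsets 0, 8 and 16; the notes do not show the class declaration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout for class foo: three 64-bit fields, matching the
 * movq accesses at offsets 0, 8 and 16 in the listing. */
struct foo {
    int64_t x;
    int64_t y;
    int64_t z;
};

/* The offsets are compile-time constants, so a code generator can bake
 * them into addressing modes such as 8(%rdi). */
enum {
    FOO_X_OFFSET = offsetof(struct foo, x),
    FOO_Y_OFFSET = offsetof(struct foo, y),
    FOO_Z_OFFSET = offsetof(struct foo, z)
};

/* The statement x := y + z written against an explicit self pointer. */
void foo_add(struct foo *self)
{
    self->x = self->y + self->z;
}
```

A dynamic language cannot use constants like these, because the set of fields is not fixed when the code is generated.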
x := y / z
(C global variables)
movl y(%rip), %eax
movl %eax, %edx
sarl $31, %edx
idivl z(%rip)
movl %eax, x(%rip)
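sarl is an arithmetic shift right: shifting by 31 fills %edx with copies of the sign bit of %eax, which sign-extends y into the 64-bit pair %edx:%eax. idivl divides that 64-bit numerator by a 32-bit denominator, leaving the quotient in %eax and the remainder in %edx.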
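The sign-fill and the 64-bit numerator can be modeled in plain C. This is only a sketch of the arithmetic, with hypothetical function names. It does not model the hardware exception that a real idivl raises on division by zero or on a quotient that does not fit in 32 bits.

```c
#include <stdint.h>
#include <string.h>

/* Models "movl %eax, %edx; sarl $31, %edx": every bit of the result is
 * a copy of the sign bit of eax. */
uint32_t sign_fill(int32_t eax)
{
    return eax < 0 ? 0xFFFFFFFFu : 0u;
}

/* Models the quotient of idivl: the numerator is the 64-bit pair
 * edx:eax, the denominator is a 32-bit operand. The caller must avoid
 * a zero divisor and a quotient that overflows 32 bits. */
int32_t idivl_quotient(uint32_t edx, int32_t eax, int32_t divisor)
{
    uint64_t bits = ((uint64_t)edx << 32) | (uint32_t)eax;
    int64_t numerator;

    /* Reinterpret the 64 bits as a signed two's-complement value. */
    memcpy(&numerator, &bits, sizeof numerator);
    return (int32_t)(numerator / divisor);
}

/* x := y / z as the listing computes it. */
int32_t divide(int32_t y, int32_t z)
{
    return idivl_quotient(sign_fill(y), y, z);
}
```

Without the sign fill, a negative y would be read as a large positive 64-bit numerator, which is why the shift precedes the divide.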
x := - y
(local variables)
movl -4(%rbp), %eax
negl %eax
movl %eax, -8(%rbp)
x := y
(local variables)
movl -4(%rbp), %eax
movl %eax, -8(%rbp)
Note: mov cannot copy directly from memory to memory, so the value passes through a register.
x := &y
(y global)
movq $y, -8(%rbp)
Note: $y is an immediate operand holding the absolute address of y; the mov instruction stores it to a memory location given in register-relative form.
x := &y
(y local)
leaq -12(%rbp), %rax
movq %rax, -8(%rbp)
Load effective address: leaq computes the address -12(%rbp) itself instead of fetching the contents stored there.
x := *y
*x := y
goto L
jmp L
if x < y then goto L
movq x, %rax
cmpq y, %rax
jl L
The comparison sets the "condition code bits" in the flags register; there is a full set of conditional jumps for the various comparison operators.
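For signed operands, the choice of jump after "cmpq y, %rax" (with x in %rax) can be written as a lookup table. This is a sketch with hypothetical names. It covers signed comparisons only; unsigned comparisons use a different family of jumps (ja, jb and so on) that is left out here.

```c
#include <stddef.h>
#include <string.h>

/* One row per relational operator: the jump taken when "x relop y"
 * holds after "cmpq y, %rax" with x in %rax, and the jump taken when
 * it does not hold. Signed comparisons only. */
struct relop_jump {
    const char *relop;
    const char *taken;
    const char *inverse;
};

static const struct relop_jump jumps[] = {
    { "<",  "jl",  "jge" },
    { "<=", "jle", "jg"  },
    { ">",  "jg",  "jle" },
    { ">=", "jge", "jl"  },
    { "==", "je",  "jne" },
    { "!=", "jne", "je"  },
};

/* Returns the jump mnemonic for relop, or its inverse when negate is
 * nonzero. Returns NULL for an operator that is not in the table. */
const char *jump_for(const char *relop, int negate)
{
    size_t i;

    for (i = 0; i < sizeof jumps / sizeof jumps[0]; i++)
        if (strcmp(jumps[i].relop, relop) == 0)
            return negate ? jumps[i].inverse : jumps[i].taken;
    return NULL;
}
```

The inverse column is what a generator uses when it wants to jump around a block rather than into it.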
if x then goto L
cmpq $0, -8(%rbp)
jne L
if !x then goto L
cmpq $0, -8(%rbp)
jne L'
jmp L
L':
Why not the shorter:
cmpq $0, -8(%rbp)
je L
param x
movq -8(%rbp), reg
Calculate which parameter number this is by counting the instructions in the linked list from here until you get to the CALL instruction. Parameters 1-6 are passed in registers; the others go on the stack.
param #: 1 2 3 4 5 6
register: %rdi %rsi %rdx %rcx %r8 %r9
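The register assignment and the counting step can be sketched in C as follows. The struct tac, the opcode names and both functions are hypothetical. Whether the count equals the parameter number depends on the order in which the intermediate code emits its param instructions; the sketch assumes the last parameter is emitted first.

```c
#include <stddef.h>

/* Returns the register that carries integer or pointer parameter n
 * (1-based), or NULL when the parameter goes on the stack. */
const char *param_register(int n)
{
    static const char *const regs[] = {
        "%rdi", "%rsi", "%rdx", "%rcx", "%r8", "%r9"
    };

    if (n >= 1 && n <= 6)
        return regs[n - 1];
    return NULL;
}

/* Hypothetical three-address instruction node in a linked list. */
enum opcode { OP_PARAM, OP_CALL, OP_OTHER };

struct tac {
    enum opcode op;
    struct tac *next;
};

/* Counts the PARAM instructions from p (inclusive) up to the next CALL.
 * If params are emitted last-first (an assumption), this count is the
 * parameter number of p. */
int params_until_call(const struct tac *p)
{
    int n = 0;

    for (; p != NULL && p->op != OP_CALL; p = p->next)
        if (p->op == OP_PARAM)
            n++;
    return n;
}
```

A generator would then call param_register(params_until_call(p)) and fall back to a push when the result is NULL.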
call p,n,x
If the call is to a member function, I hope you remembered to insert/push the "self" object as the first parameter for the method invocation.
return x
movl -8(%rbp), %eax
jmp Lend
Load the return value into the ax register, then jump to the end to return.
global x,n1,n2
Treat globals as class variables of some "global" singleton?
proc x,n1,n2
        .text
        .p2align 4,,15
.globl f
        .type f, @function
f:
.LFBn:
        .cfi_startproc
...
.LFBn
local x,n
label Ln
end
Lend:
        leave
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
.LFEn:
        .size f, .-f
The counter n in .LFEn is incremented for each function in the file.
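The per-file counter and the fixed text of the function end can be sketched like this. Both function names are hypothetical, and the emitted lines simply copy the listing above with the function name and counter substituted.

```c
#include <stdio.h>
#include <stddef.h>

/* Hands out 0, 1, 2, ... for successive functions in one output file. */
int next_function_number(void)
{
    static int counter = 0;

    return counter++;
}

/* Hypothetical emitter for the "end" instruction of function `name`,
 * whose .LFB/.LFE labels carry the number n. Writes into buf and
 * returns what snprintf returns. */
int emit_proc_end(char *buf, size_t size, const char *name, int n)
{
    return snprintf(buf, size,
                    "Lend:\n"
                    "\tleave\n"
                    "\t.cfi_def_cfa 7, 8\n"
                    "\tret\n"
                    "\t.cfi_endproc\n"
                    ".LFE%d:\n"
                    "\t.size %s, .-%s\n",
                    n, name, name);
}
```

The same number should be used for the matching .LFBn label at the start of the function, so a generator would fetch it once per proc instruction.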
x := y field z
May involve y's class.
class x,n1,n2
field x,n