code generation

This document is a guide to generating target code from intermediate three-address instructions. The methodology used to create it is perhaps more important than its contents, which are unfinished and very simple. The guide was produced by reverse engineering, that is, by examining the output of "gcc -S".

intermediate code instruction        x86_64 equivalent        Comment

int x; (global)
        .comm   x,4,4
    Operands: name, size, alignment.

x := y + z (C global variables)
        movl    y(%rip), %edx
        movl    z(%rip), %eax
        leal    (%rdx,%rax), %eax
        movl    %eax, x(%rip)
    The register %rip, which is not mentioned in Bryant/O'Hallaron Figure 2, is the 64-bit instruction pointer, a.k.a. program counter.

x := y + z (local variables)
        movl    -4(%rbp), %eax
        movl    -8(%rbp), %edx
        leal    (%rdx,%rax), %eax
        movl    %eax, -12(%rbp)
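The global and local versions of x := y + z differ only in how each operand is addressed, so a code generator can share one emitter for both. Below is a minimal sketch of that idea in C. The names `struct addr`, `emit`, `operand`, and `emit_add` are hypothetical and not part of the guide, and the sketch assumes 32-bit int operands as in the examples above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical operand descriptor (not from the guide): a global is
   addressed by name, a local by its offset from %rbp. */
enum region { R_GLOBAL, R_LOCAL };

struct addr {
    enum region region;
    const char *name;   /* used when region == R_GLOBAL */
    int offset;         /* used when region == R_LOCAL  */
};

/* Append formatted assembly text to the NUL-terminated buffer out. */
static void emit(char *out, size_t cap, const char *fmt, ...)
{
    size_t used = strlen(out);
    va_list ap;

    assert(used < cap);
    va_start(ap, fmt);
    vsnprintf(out + used, cap - used, fmt, ap);
    va_end(ap);
}

/* Format an operand: "y(%rip)" for a global, "-4(%rbp)" for a local. */
static void operand(char *buf, size_t cap, struct addr a)
{
    if (a.region == R_GLOBAL)
        snprintf(buf, cap, "%s(%%rip)", a.name);
    else
        snprintf(buf, cap, "%d(%%rbp)", a.offset);
}

/* Emit x := y + z using the four-instruction pattern shown above. */
static void emit_add(char *out, size_t cap,
                     struct addr x, struct addr y, struct addr z)
{
    char ox[64], oy[64], oz[64];

    operand(ox, sizeof ox, x);
    operand(oy, sizeof oy, y);
    operand(oz, sizeof oz, z);
    emit(out, cap, "\tmovl\t%s, %%edx\n", oy);
    emit(out, cap, "\tmovl\t%s, %%eax\n", oz);
    emit(out, cap, "\tleal\t(%%rdx,%%rax), %%eax\n");
    emit(out, cap, "\tmovl\t%%eax, %s\n", ox);
}
```

The caller starts with an empty string in the buffer and calls `emit_add` once per three-address ADD instruction. Which operand lands in %edx and which in %eax does not affect the sum.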

x := y + z (class foo variables)
        movq    %rdi, -8(%rbp)      # t1 = self
        movq    -8(%rbp), %rax      # rax = self
        movq    8(%rax), %rdx       # rdx = self->y
        movq    -8(%rbp), %rax      # rax = self
        movq    16(%rax), %rax      # rax = self->z
        addq    %rax, %rdx          # rdx = y+z
        movq    -8(%rbp), %rax      # rax = self
        movq    %rdx, (%rax)        # self->x = rdx
    optimizes (-O2) to
        movq    16(%rdi), %rax
        addq    8(%rdi), %rax
        movq    %rax, (%rdi)
    Note: the main issue is the memory layout of fields x, y, z, here at offsets 0, 8, 16. These offsets are known at compile time for static/non-virtual OOP. A dynamic/virtual implementation would treat this as
        self.x = self.y + self.z
    and implement each field operation via a runtime call or table lookup.
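The two field-access strategies can be illustrated in C. This is only a sketch under assumptions: `struct foo` with three 8-byte `int64_t` fields is a hypothetical layout chosen to match the offsets 0, 8, 16 above, and the offset table `foo_offsets` with its `field` helper is one possible form of "table lookup", not something the guide specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for class foo: three 8-byte fields. */
struct foo {
    int64_t x;   /* offset 0  */
    int64_t y;   /* offset 8  */
    int64_t z;   /* offset 16 */
};

/* Static (non-virtual) access: the offsets are compile-time constants,
   so the compiler can fold them into operands like 8(%rdi). */
static void foo_add(struct foo *self)
{
    assert(self != NULL);
    self->x = self->y + self->z;
}

/* Dynamic-style access: field offsets are looked up in a table at run time. */
enum { F_X, F_Y, F_Z, F_COUNT };

static const size_t foo_offsets[F_COUNT] = {
    offsetof(struct foo, x),
    offsetof(struct foo, y),
    offsetof(struct foo, z)
};

/* Return a pointer to field number f of the object at obj. */
static int64_t *field(void *obj, int f)
{
    assert(obj != NULL && f >= 0 && f < F_COUNT);
    return (int64_t *)((char *)obj + foo_offsets[f]);
}
```

With the table version, `*field(o, F_X) = *field(o, F_Y) + *field(o, F_Z)` computes the same result as `foo_add(o)` but pays for a lookup on every field operation.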

x := y / z (C global variables)
        movl    y(%rip), %eax
        movl    %eax, %edx
        sarl    $31, %edx
        idivl   z(%rip)
        movl    %eax, x(%rip)
    sarl, the arithmetic shift right, fills %edx with copies of the sign bit of %eax. idivl then divides the 64-bit numerator held in %edx:%eax by its 32-bit operand, leaving the quotient in %eax and the remainder in %edx.
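The sign-extension step before idivl can be modeled in C to show why %edx must be filled with the sign bit. This is a sketch of the arithmetic only, not of gcc's implementation. The names `sign_fill` and `idivl_model` are hypothetical, and the conversion from `uint64_t` to `int64_t` assumes the usual two's-complement behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "movl %eax, %edx; sarl $31, %edx": every bit of the result
   is a copy of the sign bit of eax, so it is 0 or 0xFFFFFFFF. */
static uint32_t sign_fill(int32_t eax)
{
    return eax < 0 ? 0xFFFFFFFFu : 0u;
}

/* Model of idivl: divide the 64-bit value edx:eax by a 32-bit divisor
   and return the quotient (what idivl leaves in %eax). */
static int32_t idivl_model(int32_t y, int32_t z)
{
    uint64_t bits;
    int64_t dividend;

    assert(z != 0);                          /* idivl faults on divide by zero */
    assert(!(y == INT32_MIN && z == -1));    /* quotient would not fit in 32 bits */
    bits = ((uint64_t)sign_fill(y) << 32) | (uint32_t)y;
    dividend = (int64_t)bits;                /* assumes two's complement */
    return (int32_t)(dividend / z);
}
```

If %edx were left as zero instead, a negative y would be read as a large positive 64-bit numerator and the quotient would be wrong.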

x := - y (local variables)
        movl    -4(%rbp), %eax
        negl    %eax
        movl    %eax, -8(%rbp)

x := y (local variables)
        movl    -4(%rbp), %eax
        movl    %eax, -8(%rbp)
    Note: mov cannot copy directly from memory to memory.

x := &y (y global)
        movq    $y, -8(%rbp)
    Note: $y gives the absolute address of y as an immediate operand, and mov can store an immediate directly to a memory location given in register-relative form.

x := &y (y local)
        leaq    -12(%rbp), %rax
        movq    %rax, -8(%rbp)
    Load effective address: leaq computes the address -12(%rbp) itself instead of fetching the contents stored there.

x := *y

*x := y
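The guide leaves the two pointer entries above empty. The following C sketch shows one plausible translation for local variables. It is an assumption, not taken from "gcc -S" output in this guide: the pointer is treated as an 8-byte local, the pointed-to value as a 4-byte int, and the helper names `emit`, `emit_load_indirect`, and `emit_store_indirect` are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Append formatted assembly text to the NUL-terminated buffer out. */
static void emit(char *out, size_t cap, const char *fmt, ...)
{
    size_t used = strlen(out);
    va_list ap;

    assert(used < cap);
    va_start(ap, fmt);
    vsnprintf(out + used, cap - used, fmt, ap);
    va_end(ap);
}

/* x := *y, where y is a pointer local at y_off and x an int local at x_off. */
static void emit_load_indirect(char *out, size_t cap, int x_off, int y_off)
{
    emit(out, cap, "\tmovq\t%d(%%rbp), %%rax\n", y_off);  /* rax = y    */
    emit(out, cap, "\tmovl\t(%%rax), %%eax\n");           /* eax = *rax */
    emit(out, cap, "\tmovl\t%%eax, %d(%%rbp)\n", x_off);  /* x = eax    */
}

/* *x := y, where x is a pointer local at x_off and y an int local at y_off. */
static void emit_store_indirect(char *out, size_t cap, int x_off, int y_off)
{
    emit(out, cap, "\tmovq\t%d(%%rbp), %%rax\n", x_off);  /* rax = x    */
    emit(out, cap, "\tmovl\t%d(%%rbp), %%edx\n", y_off);  /* edx = y    */
    emit(out, cap, "\tmovl\t%%edx, (%%rax)\n");           /* *rax = edx */
}
```

Both cases go through a register because, as with plain assignment, mov cannot copy from memory to memory.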

goto L
        jmp     L

if x < y then goto L
        movq    x, %rax
        cmpq    y, %rax
        jl      L
    The compare sets the full set of "condition code bits" in the flags register, and there is a conditional jump for each of the various comparison operators.
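Choosing the jump instruction for each comparison operator amounts to a small table lookup. The C sketch below assumes signed operands and a preceding compare in AT&T order, "cmp y, x", which sets the flags from x - y. The function name `jump_for` and the operator spellings are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Return the signed conditional jump for "if x OP y then goto L",
   or NULL if the operator is not recognized. */
static const char *jump_for(const char *op)
{
    static const struct { const char *op; const char *jcc; } table[] = {
        { "<",  "jl"  },
        { "<=", "jle" },
        { ">",  "jg"  },
        { ">=", "jge" },
        { "==", "je"  },
        { "!=", "jne" }
    };
    size_t i;

    assert(op != NULL);
    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(op, table[i].op) == 0)
            return table[i].jcc;
    return NULL;
}
```

Unsigned comparisons would need the jb/jbe/ja/jae family instead, which this sketch does not cover.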

if x then goto L
        cmpq    $0, -8(%rbp)
        jne     L

if !x then goto L
        cmpq    $0, -8(%rbp)
        jne     L'
        jmp     L
    L':
    Why not:
        cmpq    $0, -8(%rbp)
        je      L

param x
        movq    -8(%rbp), reg
    Determine which parameter number you are by counting the instructions in the linked list up to the CALL instruction. Parameters 1-6 are passed in registers; any others go on the stack.
        param #     1     2     3     4     5     6
        register    %rdi  %rsi  %rdx  %rcx  %r8   %r9
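The counting step and the register table might look like this in C. It is a sketch under assumptions: `struct instr` is a hypothetical linked-list node, and how the count maps to a parameter number depends on the order in which the intermediate code lists its params, which the guide does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-address instruction node; the guide only says the
   instructions form a linked list. */
enum opcode { OP_PARAM, OP_CALL, OP_OTHER };

struct instr {
    enum opcode op;
    struct instr *next;
};

/* Count the PARAM instructions from p (inclusive) up to the next CALL. */
static int params_until_call(const struct instr *p)
{
    int n = 0;

    for (; p != NULL && p->op != OP_CALL; p = p->next)
        if (p->op == OP_PARAM)
            n++;
    return n;
}

/* Register for parameter number n (1-based), or NULL if the parameter
   is passed on the stack. */
static const char *param_reg(int n)
{
    static const char *const regs[] = {
        "%rdi", "%rsi", "%rdx", "%rcx", "%r8", "%r9"
    };

    assert(n >= 1);
    return n <= 6 ? regs[n - 1] : NULL;
}
```

A generator would replace the "reg" placeholder in the movq above with the result of `param_reg`, and fall back to a push when it returns NULL.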

call p,n,x
    If the call is to a member function, I hope you remembered to insert/push the "self" object as the first parameter for the method invocation.

return x
        movl    -8(%rbp), %eax
        jmp     Lend
    Load the return value into the ax register, then jump to the end to return.

global x,n1,n2
    Treat globals as class variables of some "global" singleton?

proc x,n1,n2
        .text
        .p2align 4,,15
        .globl  f
        .type   f, @function
    f:
    .LFBn:
        .cfi_startproc
        ...
    .LFBn

local x,n

label Ln

end
    Lend:
        leave
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
    .LFEn:
        .size   f, .-f
    The counter for .LFEn is incremented for each function in the file.
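The proc and end boilerplate can be produced by a pair of small emitters that share a per-file function counter. The C sketch below simply reproduces the directives listed above. The names `emit`, `emit_proc`, and `emit_end` are hypothetical, and the caller is assumed to increment `n` once per function.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Append formatted assembly text to the NUL-terminated buffer out. */
static void emit(char *out, size_t cap, const char *fmt, ...)
{
    size_t used = strlen(out);
    va_list ap;

    assert(used < cap);
    va_start(ap, fmt);
    vsnprintf(out + used, cap - used, fmt, ap);
    va_end(ap);
}

/* Emit the header for function name; n is the per-file function counter. */
static void emit_proc(char *out, size_t cap, const char *name, int n)
{
    emit(out, cap, "\t.text\n");
    emit(out, cap, "\t.p2align 4,,15\n");
    emit(out, cap, "\t.globl\t%s\n", name);
    emit(out, cap, "\t.type\t%s, @function\n", name);
    emit(out, cap, "%s:\n", name);
    emit(out, cap, ".LFB%d:\n", n);
    emit(out, cap, "\t.cfi_startproc\n");
}

/* Emit the matching footer, using the same counter value n. */
static void emit_end(char *out, size_t cap, const char *name, int n)
{
    emit(out, cap, "Lend:\n");
    emit(out, cap, "\tleave\n");
    emit(out, cap, "\t.cfi_def_cfa 7, 8\n");
    emit(out, cap, "\tret\n");
    emit(out, cap, "\t.cfi_endproc\n");
    emit(out, cap, ".LFE%d:\n", n);
    emit(out, cap, "\t.size\t%s, .-%s\n", name, name);
}
```

Note that a fixed "Lend" label would collide once a file holds more than one function, so a real generator would probably number it with the same counter.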

x := y field z
    May involve y's class.

class x,n1,n2

field x,n