Intermediate Code Generator, Slides of Compiler Design

These slides cover intermediate code generation and three-address code. Topics include variants of syntax trees, types and declarations, translation of expressions, control flow, and backpatching. The slides also explain the value-number method for constructing DAGs, the forms of three-address instructions, and the data structures used to hold three-address code. They give examples of translations to three-address code and explain how to address array elements. They also cover the translation of Boolean expressions, short-circuit evaluation, and numerical and positional encoding.

Typology: Slides

2021/2022

Available from 09/23/2023

avishek-1

Chapter 6
Intermediate Code Generation
Unit 3 (contd..)

Outline

Variants of Syntax Trees

Three-address code

Types and declarations

Translation of expressions

Control flow

Backpatching

Variants of syntax trees

It is sometimes beneficial to create a DAG instead of a tree for expressions. A DAG makes the common subexpressions explicit, and that knowledge can then be used during code generation.

Example: a + a*(b-c) + (b-c)*d

The common subexpression b-c has two parents, a*(b-c) and (b-c)*d.

[Figure: DAG for a + a*(b-c) + (b-c)*d, with leaves a, b, c, d; callout: next, a + a*(b-c) is evaluated]

SDD for creating DAGs

     Production        Semantic rules
  1. E -> E1 + T       E.node = new Node('+', E1.node, T.node)
  2. E -> E1 - T       E.node = new Node('-', E1.node, T.node)
  3. E -> E1 * T       E.node = new Node('*', E1.node, T.node)
  4. E -> T            E.node = T.node
  5. T -> (E)          T.node = E.node
  6. T -> id           T.node = new Leaf(id, id.entry)
  7. T -> num          T.node = new Leaf(num, num.val)

Steps for constructing the DAG:

   1) p1  = Leaf(id, entry-a)
   2) p2  = Leaf(id, entry-a) = p1
   3) p3  = Leaf(id, entry-b)
   4) p4  = Leaf(id, entry-c)
   5) p5  = Node('-', p3, p4)
   6) p6  = Node('*', p1, p5)
   7) p7  = Node('+', p1, p6)
   8) p8  = Leaf(id, entry-b) = p3
   9) p9  = Leaf(id, entry-c) = p4
  10) p10 = Node('-', p3, p4) = p5
  11) p11 = Leaf(id, entry-d)
  12) p12 = Node('*', p5, p11)
  13) p13 = Node('+', p7, p12)

Note: when a call is repeated with the same arguments, the existing node is returned instead of a new one, so p2 = p1.

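Hash table for the nodes of a DAG stored in an array

• Each node is a record in an array, and its index in the array is its value number.
• The hash function h computes the index of the bucket from the signature <op, l, r>.
• Buckets can be implemented as linked lists.
• Within the linked list for a bucket, each cell holds the value number of one of the nodes that hash into that bucket.
• For each value number v found in a cell, we check whether the node with value number v has the input signature <op, l, r>. If it does, v is returned; if no cell matches, a new node is created and its value number is added to the bucket.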
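The lookup described above can be sketched in a few lines of Python. This is only an illustration: the class name `DagBuilder`, the fixed bucket count, and the use of Python's built-in `hash` as the function h are assumptions, and Python lists stand in for the linked lists.

```python
class DagBuilder:
    """Value-number method: a node's index in self.nodes is its value number."""

    def __init__(self, num_buckets=16):      # bucket count is an arbitrary choice
        self.nodes = []                       # array of (op, l, r) records
        self.buckets = [[] for _ in range(num_buckets)]

    def node(self, op, left=None, right=None):
        signature = (op, left, right)
        # h(op, l, r): Python's hash is used here as a stand-in hash function
        bucket = self.buckets[hash(signature) % len(self.buckets)]
        for v in bucket:                      # each cell holds a value number
            if self.nodes[v] == signature:    # signature matches: reuse node v
                return v
        self.nodes.append(signature)          # no match: create a new node
        v = len(self.nodes) - 1
        bucket.append(v)
        return v


# Build the DAG for a + a*(b-c) + (b-c)*d, following the steps p1..p13.
dag = DagBuilder()
p1 = dag.node("id", "a")
p2 = dag.node("id", "a")                      # repeated call, same node as p1
p3 = dag.node("id", "b")
p4 = dag.node("id", "c")
p5 = dag.node("-", p3, p4)
p6 = dag.node("*", p1, p5)
p7 = dag.node("+", p1, p6)
p10 = dag.node("-", dag.node("id", "b"), dag.node("id", "c"))  # same as p5
p11 = dag.node("id", "d")
p12 = dag.node("*", p5, p11)
p13 = dag.node("+", p7, p12)
```

Thirteen calls produce only nine distinct nodes, because the repeated leaves and the repeated b-c are found in the hash table.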

Three address code

• In three-address code there is at most one operator on the right side of an instruction. The expression x+y*z must be translated into i) t1 = y*z; ii) t2 = x+t1

• Example: a + a*(b-c) + (b-c)*d

[Figure: DAG for the expression, with leaves a, b, c, d]

t1 = b - c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4
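As a rough sketch of how such a sequence can be produced, the Python function below walks an expression bottom-up and gives every interior node one temporary. The nested-tuple representation and the name `to_three_address` are assumptions made for this example; structurally equal subexpressions share a temporary, which mimics the sharing in a DAG.

```python
def to_three_address(expr):
    """Emit three-address code for a nested-tuple expression.

    expr is a variable name (str) or a tuple (op, left, right).
    """
    code = []
    temp_of = {}                      # subexpression -> temporary holding it

    def walk(e):
        if isinstance(e, str):        # a leaf is its own address
            return e
        if e in temp_of:              # common subexpression: reuse its temporary
            return temp_of[e]
        op, left, right = e
        l = walk(left)
        r = walk(right)
        t = f"t{len(temp_of) + 1}"    # new temporary for this node
        temp_of[e] = t
        code.append(f"{t} = {l} {op} {r}")
        return t

    walk(expr)
    return code


b_minus_c = ("-", "b", "c")
expr = ("+", ("+", "a", ("*", "a", b_minus_c)), ("*", b_minus_c, "d"))
tac = to_three_address(expr)
```

For the example expression, `tac` holds the same five instructions as the slide, with t1 computed once and used twice.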

Forms of three address instructions: procedure call

• Procedure calls use:
    • param x for parameters
    • call p,n for a procedure call, where p is the procedure and n is the number of parameters
    • y = call p,n for a function call, where y is the return value
• A typical procedure call looks like this:

param x1
param x2
...
param xn
call p,n
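The calling sequence is mechanical enough to generate with a small helper. The sketch below is illustrative only; the function name `gen_call` and its `result` parameter are hypothetical, and arguments are assumed to be already evaluated into names or temporaries.

```python
def gen_call(proc, args, result=None):
    """Return the three-address instructions for a call to proc."""
    code = [f"param {a}" for a in args]       # one param instruction per argument
    call = f"call {proc},{len(args)}"         # n is the number of parameters
    # A function call stores the return value; a procedure call does not.
    code.append(f"{result} = {call}" if result else call)
    return code


procedure_call = gen_call("p", ["x1", "x2", "x3"])
function_call = gen_call("f", ["a"], result="y")
```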

Examples of translations to three address code

do i = i+1; while (a[i] < v);

Symbolic labels:

L:   t1 = i + 1
     i = t1
     t2 = i * 8
     t3 = a[t2]
     if t3 < v goto L

Position numbers:

100: t1 = i + 1
101: i = t1
102: t2 = i * 8
103: t3 = a[t2]
104: if t3 < v goto 100
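The two listings differ only in how jump targets are named, so one can be derived from the other. The following Python sketch converts symbolic labels to position numbers in two passes. It assumes one instruction per line, a label written as `name:` at the start of a line, and jumps that end in `goto name`; the function name and the starting position 100 are choices made for this example.

```python
def resolve_labels(lines, start=100):
    """Replace symbolic labels with position numbers."""
    positions = {}                    # label -> position number
    instructions = []
    # Pass 1: strip labels and record the position each one refers to.
    for line in lines:
        label, sep, rest = line.partition(":")
        if sep and label.strip().isidentifier():
            positions[label.strip()] = start + len(instructions)
            line = rest
        instructions.append(line.strip())
    # Pass 2: rewrite jump targets and prefix each line with its position.
    numbered = []
    for i, instr in enumerate(instructions):
        words = instr.split()
        if len(words) >= 2 and words[-2] == "goto":
            words[-1] = str(positions[words[-1]])
        numbered.append(f"{start + i}: {' '.join(words)}")
    return numbered


symbolic = [
    "L: t1 = i + 1",
    "i = t1",
    "t2 = i * 8",
    "t3 = a[t2]",
    "if t3 < v goto L",
]
numbered = resolve_labels(symbolic)
```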

Example

• a = b * minus c + b * minus c

Three address code:

t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5

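Quadruples:

       op      arg1   arg2   result
  0    minus   c             t1
  1    *       b      t1     t2
  2    minus   c             t3
  3    *       b      t3     t4
  4    +       t2     t4     t5
  5    =       t5            a

Triples:

       op      arg1   arg2
  0    minus   c
  1    *       b      (0)
  2    minus   c
  3    *       b      (2)
  4    +       (1)    (3)
  5    =       a      (4)

Indirect triples:

  instruction          op      arg1   arg2
  (0)             0    minus   c
  (1)             1    *       b      (0)
  (2)             2    minus   c
  (3)             3    *       b      (2)
  (4)             4    +       (1)    (3)
  (5)             5    =       a      (4)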

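Triples are more compact, but quadruples give more freedom to move instructions. A quadruple names its result with an explicit temporary, so the instructions that use that temporary need no change when it moves. A triple is referred to by its position, so moving it would require updating every reference to it.

Indirect triples consist of a list of pointers to triples. With this representation, an optimizer can reorder instructions simply by reordering the list, without changing the triples themselves.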
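To make the relationship between the three representations concrete, here is a small Python sketch that converts the quadruples for a = b * minus c + b * minus c into triples and then builds an indirect-triples instruction list. The tuple layout and the name `quads_to_triples` are assumptions for this example. It also assumes that every result other than the target of a copy is a temporary, and it represents a reference to a triple as an integer position.

```python
quads = [
    ("minus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("minus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    ("=", "t5", None, "a"),
]


def quads_to_triples(quads):
    """Replace temporary names with the position of the triple computing them."""
    position_of = {}                  # temporary -> position of its triple
    triples = []
    for op, arg1, arg2, result in quads:
        a1 = position_of.get(arg1, arg1)
        a2 = position_of.get(arg2, arg2)
        if op == "=":                 # copy: the target goes in arg1
            triples.append(("=", result, a1))
        else:
            position_of[result] = len(triples)
            triples.append((op, a1, a2))
    return triples


triples = quads_to_triples(quads)

# Indirect triples: a separate list of pointers into the triple table.
instruction = list(range(len(triples)))
# Swapping two independent instructions touches only the pointer list.
instruction[1], instruction[2] = instruction[2], instruction[1]
```

After the swap the triple table is unchanged, so the references (0) and (2) inside it remain valid.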

Static single assignment (SSA)

• y := 1; y := 2; x := y
• SSA form: y1 := 1; y2 := 2; x1 := y2
• SSA in control flow uses a φ function to combine the two definitions of a variable x that come from the true and false parts of a branch.
• Compiler optimization algorithms which are either enabled or strongly enhanced by the use of SSA include:
    • Constant propagation
    • Value range propagation [3]
    • Sparse conditional constant propagation
    • Dead code elimination
    • Global value numbering
    • Partial redundancy elimination
    • Strength reduction
    • Register allocation
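For straight-line code, the renaming into SSA form can be sketched as below. This is a simplified illustration, not a full SSA construction: it handles no control flow and therefore places no φ functions. The function name `to_ssa` and the (target, expression) input format are assumptions, and versions are written as plain suffixes (y1, y2) in place of subscripts.

```python
import re


def to_ssa(statements):
    """Rename variables so that each one is assigned exactly once.

    statements is a list of (target, expression) pairs in program order.
    """
    version = {}                      # variable -> latest version number
    out = []
    for target, expr in statements:
        def current(match):
            name = match.group(0)
            # A use refers to the most recent definition, if there is one.
            return f"{name}{version[name]}" if name in version else name

        new_expr = re.sub(r"[A-Za-z_]\w*", current, expr)
        version[target] = version.get(target, 0) + 1   # fresh version per assignment
        out.append(f"{target}{version[target]} := {new_expr}")
    return out


ssa = to_ssa([("y", "1"), ("y", "2"), ("x", "y")])
```

Because x reads y2 and nothing reads y1, the first assignment is visibly dead, which is the kind of fact that the optimizations listed above exploit.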

Three-address code for expressions

The syntax-directed definition for expressions translates the assignment a = b + - c into the three-address sequence

t1 = minus c
t2 = b + t1
a = t2

Description

• Attributes S.code and E.code denote the three-address code for statement S and expression E, respectively.
• Attribute E.addr denotes the address that will hold the value of E.
• For the last production, E -> id, the value of the expression is held in id itself, so E.addr points to the symbol-table entry for this instance of id. The function top.get retrieves that entry when it is applied to the string representation id.lexeme of this instance of id; top denotes the current symbol table. E.code is set to the empty string.
• When E -> (E1), the translation of E is the same as that of the subexpression E1. Hence, E.addr equals E1.addr, and E.code equals E1.code.

Translation of expressions (contd..)

When we translate the production E -> E1+E2, the semantic rules build up E.code by concatenating E1.code, E2.code, and an instruction that adds the values of E1 and E2. The instruction puts the result of the addition into a new temporary name for E, denoted by E.addr.

The translation of E -> - E1 is similar. The rules create a new temporary for E and generate an instruction to perform the unary minus operation.

Finally, the production S -> id = E ; generates instructions that assign the value of expression E to the identifier id. The semantic rule for this production uses the function top.get to determine the address of the identifier represented by id, as in the rules for E -> id. S.code consists of the instructions to compute the value of E into an address given by E.addr, followed by an assignment to the address top.get(id.lexeme) for this instance of id.
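The rules described above can be mirrored in a short Python sketch in which each expression returns its addr and code attributes. The names `Attr`, `translate_expr`, and `translate_assign` are hypothetical, expressions are assumed to arrive as nested tuples, and the symbol-table lookup top.get(id.lexeme) is simplified to using the identifier's name as its address.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Attr:
    addr: str        # E.addr: where the value of E is held
    code: list       # E.code: instructions that compute E


def translate_expr(e, temps):
    if isinstance(e, str):                    # E -> id: empty code
        return Attr(addr=e, code=[])
    if e[0] == "minus":                       # E -> - E1
        e1 = translate_expr(e[1], temps)
        t = f"t{next(temps)}"                 # new temporary for E
        return Attr(t, e1.code + [f"{t} = minus {e1.addr}"])
    op, left, right = e                       # E -> E1 + E2
    e1 = translate_expr(left, temps)
    e2 = translate_expr(right, temps)
    t = f"t{next(temps)}"
    # E.code = E1.code || E2.code || the new instruction
    return Attr(t, e1.code + e2.code + [f"{t} = {e1.addr} {op} {e2.addr}"])


def translate_assign(name, e):                # S -> id = E ;
    expr = translate_expr(e, count(1))
    return expr.code + [f"{name} = {expr.addr}"]


code = translate_assign("a", ("+", "b", ("minus", "c")))
```

For a = b + - c this yields the sequence t1 = minus c, t2 = b + t1, a = t2. Note that every level copies and concatenates its children's code lists.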

Incremental translation

• Code attributes can be long strings, so they are generated incrementally. In the incremental method we generate only the new three-address instructions.
• Here gen not only constructs a three-address instruction, it also appends the instruction to the sequence of instructions generated so far. The sequence may either be retained in memory for further processing, or it may be output incrementally.
• With the incremental approach, the code attribute is not used, since there is a single sequence of instructions that is created by successive calls to gen. For example, the semantic rule for E -> E1+E2 calls gen to generate an add instruction; the instructions to compute E1 into E1.addr and E2 into E2.addr have already been generated.
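A minimal Python sketch of the incremental scheme follows. The class name `Emitter` and the helper names are assumptions for this example, and expressions are again assumed to be nested tuples. The key difference from a code attribute is that each rule returns only an address, while gen appends to one shared instruction sequence.

```python
class Emitter:
    """Holds the single instruction sequence built by successive calls to gen."""

    def __init__(self):
        self.instructions = []
        self.temp_count = 0

    def new_temp(self):
        self.temp_count += 1
        return f"t{self.temp_count}"

    def gen(self, instruction):
        # Construct the instruction and append it to the sequence so far.
        self.instructions.append(instruction)


def expr_addr(e, out):
    """Emit code for e and return only its address (no code attribute)."""
    if isinstance(e, str):                    # E -> id
        return e
    if e[0] == "minus":                       # E -> - E1
        a = expr_addr(e[1], out)
        t = out.new_temp()
        out.gen(f"{t} = minus {a}")
        return t
    op, left, right = e                       # E -> E1 + E2
    a1 = expr_addr(left, out)                 # code for E1 is already emitted
    a2 = expr_addr(right, out)                # code for E2 is already emitted
    t = out.new_temp()
    out.gen(f"{t} = {a1} {op} {a2}")
    return t


def assign(name, e, out):                     # S -> id = E ;
    addr = expr_addr(e, out)
    out.gen(f"{name} = {addr}")


out = Emitter()
assign("a", ("+", "b", ("minus", "c")), out)
```

No lists are copied or concatenated here; the instructions simply accumulate in `out.instructions` in the order they are generated.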