






































Presentation slides for the basics of pipelining
Typology: Slides
CMPUT 429 - Computer Systems
José Nelson Amaral (Adapted from David A. Patterson’s CS lecture slides at Berkeley)
Pipelining is a key implementation technique used to build fast processors. It allows the execution of multiple instructions to overlap in time.
A pipeline within a processor is similar to a car assembly line. Each assembly station is called a pipe stage or a pipe segment.
The throughput of an instruction pipeline measures how often an instruction exits the pipeline.
Sequential laundry takes 6 hours for 4 loads. If they learned pipelining, how long would laundry take?

[Timeline figure: 6 PM to Midnight, loads ordered vertically over time]

Pipelined laundry takes 3.5 hours for 4 loads.

[Timeline figure: 6 PM to Midnight; segment lengths 30, 40, 40, 40, 40, 20 minutes]

What is preventing them from doing it faster? How could we eliminate this limiting factor?
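The 6-hour and 3.5-hour figures can be reproduced with a small sketch. The stage times (wash 30 min, dry 40 min, fold 20 min) are read off the slide's timeline; the simulation assumes a stage starts as soon as its machine is free and the load has finished the previous stage.

```python
# Laundry stage times in minutes (wash, dry, fold), taken from the
# slide's timeline; the 40-minute dryer is the bottleneck.
STAGES = [30, 40, 20]

def finish_time(n_loads, stages=STAGES, pipelined=True):
    """Minute at which the last load is completely done."""
    if not pipelined:
        # One load runs to completion before the next starts.
        return n_loads * sum(stages)
    free = [0] * len(stages)      # when each machine becomes free
    done = 0
    for _ in range(n_loads):
        t = 0                     # finish time of this load's previous stage
        for s, minutes in enumerate(stages):
            start = max(t, free[s])
            t = start + minutes
            free[s] = t
        done = t
    return done

print(finish_time(4, pipelined=False))  # 360 minutes = 6 hours
print(finish_time(4))                   # 210 minutes = 3.5 hours
```

The pipelined time is dominated by the slowest stage: after the first wash, a load leaves the dryer every 40 minutes, which is exactly the limiting factor the slide asks about.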
[Diagram: MIPS multicycle datapath — PC, memory (read/write address, MemData, write data), instruction register (fields I[31-26], I[25-0], I[25-21], I[20-16], I[15-11], I[15-0]), register file (read registers 1 and 2, write register, write data, read data 1/2), ALU with Zero output and Target register, sign extension, shift-left-2 and concatenate/shift-left-2 units, and the multiplexers connecting them]
Figure 3.1, Page 130, CA:AQA 2e.

[Diagram: the five-stage pipelined datapath — Instruction Fetch, Instruction Decode / Register Fetch, Execute / Address Calculation, Memory Access, and Write Back — with Next PC / Next SEQ PC logic, instruction memory, register file (RS fields, RD), sign extension of the immediate (Imm), adder, ALU with Zero? test, data memory (LMD), WB data path, and the multiplexers between stages]
Step-by-step actions of the multicycle datapath, by instruction type (R-type, load, store, branch, jump):

Fetch (all types):
  IR <- Memory[PC]
  PC <- PC + 4
Decode (all types):
  A <- Registers[IR[25-21]]
  B <- Registers[IR[20-16]]
  Target <- PC + (sign-extend(IR[15-0]) << 2)
Execute:
  R-type:     ALUout <- A op B
  load/store: ALUout <- A + sign-extend(IR[15-0])
  branch:     if (A == B) then PC <- Target
  jump:       PC <- concat(PC[31-28], IR[25-0] << 2)
Memory:
  R-type: Reg[IR[15-11]] <- ALUout
  load:   memdata <- Mem[ALUout]
  store:  Mem[ALUout] <- B
Write-back:
  load: Reg[IR[20-16]] <- memdata
We can divide the execution of an instruction into the following stages:
IF: Instruction Fetch
ID: Instruction Decode
EX: Execution
MEM: Memory Access
WB: Write Back
Stage latencies: IF = 5 ns, ID = 4 ns, EX = 5 ns, MEM = 10 ns, WB = 4 ns.

Pipeline throughput: how often an instruction is completed.

T = 1 / max(lat(IF), lat(ID), lat(EX), lat(MEM), lat(WB))
  = 1 / max(5, 4, 5, 10, 4) ns
  = 1/10 instr/ns
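As a quick check, the throughput arithmetic can be written out (stage latencies in ns are the ones given on the slide):

```python
# Stage latencies in ns, as given on the slide.
lats = {"IF": 5, "ID": 4, "EX": 5, "MEM": 10, "WB": 4}

# The clock period must cover the slowest stage, so the pipeline
# completes at most one instruction per max-stage-latency.
clock = max(lats.values())   # 10 ns, limited by the MEM stage
throughput = 1 / clock       # 0.1 instructions per ns

print(clock, throughput)
```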
Pipeline latency: how long it takes to execute an instruction in the pipeline.

L = lat(IF) + lat(ID) + lat(EX) + lat(MEM) + lat(WB)
  = 5 ns + 4 ns + 5 ns + 10 ns + 4 ns
  = 28 ns
Is this right?
Simply adding the latencies to compute the pipeline latency only works for an isolated instruction:

I1: IF ID EX MEM WB     L(I1) = 28 ns
I2:  IF ID EX MEM WB    L(I2) = 33 ns
I3:   IF ID EX MEM WB   L(I3) = 38 ns
I4:    IF ID EX MEM WB  L(I4) = 43 ns

We are in trouble! The latency is not constant. This happens because this is an unbalanced pipeline. The solution is to make every stage the same length as the longest one.
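The growing latencies can be reproduced with a small sketch. It assumes unlimited buffering between stages, so a stage starts as soon as it is free and the instruction has finished the previous stage; instructions queue up behind the slow 10 ns MEM stage.

```python
# Stage latencies in ns (IF, ID, EX, MEM, WB), as given on the slide.
LATS = [5, 4, 5, 10, 4]

def instruction_latencies(n, lats=LATS):
    """Latency of each of n back-to-back instructions in the pipeline."""
    free = [0] * len(lats)       # time at which each stage becomes free
    out = []
    for _ in range(n):
        t = 0                    # finish time of the previous stage
        issue = 0
        for s, lat in enumerate(lats):
            start = max(t, free[s])
            if s == 0:
                issue = start    # instruction enters the pipeline at IF
            t = start + lat
            free[s] = t
        out.append(t - issue)    # latency = completion time - entry time
    return out

print(instruction_latencies(4))  # [28, 33, 38, 43]
```

Each later instruction waits 5 ns longer for MEM than the one before it, which is exactly the non-constant latency the slide points out.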
How long does it take to execute 20000 instructions in this pipeline? (Disregard bubbles caused by branches, cache misses, and hazards.)

How long would it take using the same modules without pipelining?

ExecTime_pipe = 20000 x 10 ns = 200000 ns = 200 us
ExecTime_non-pipe = 20000 x 28 ns = 560000 ns = 560 us
Thus the speedup that we got from the pipeline is:

Speedup = ExecTime_non-pipe / ExecTime_pipe = 560 us / 200 us = 2.8

How can we improve this pipeline design? We need to reduce the imbalance to increase the clock speed.
New stage latencies: IF = 5 ns, ID = 4 ns, EX = 5 ns, MEM1 = 5 ns, MEM2 = 5 ns, WB = 4 ns (the 10 ns MEM stage is split into two 5 ns stages).

I1: IF ID EX MEM1 MEM2 WB
I2:  IF ID EX MEM1 MEM2 WB
I3:   IF ID EX MEM1 MEM2 WB
I4:    IF ID EX MEM1 MEM2 WB
I5:     IF ID EX MEM1 MEM2 WB
I6:      IF ID EX MEM1 MEM2 WB
I7:       IF ID EX MEM1 MEM2 WB
How long does it take to execute 20000 instructions in this pipeline? (Disregard bubbles caused by branches, cache misses, etc., for now.)

ExecTime_pipe = 20000 x 5 ns = 100000 ns = 100 us

Thus the speedup that we get from the pipeline is:

Speedup = ExecTime_non-pipe / ExecTime_pipe = 560 us / 100 us = 5.6
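The execution-time and speedup arithmetic for both pipeline designs can be sketched in a few lines (fill/drain and stall cycles are ignored, as on the slides):

```python
# Execution times in ns for 20000 instructions, using the slide's numbers.
N = 20_000
t_nonpipe    = N * 28  # 28 ns per instruction, no overlap
t_unbalanced = N * 10  # clock limited by the 10 ns MEM stage
t_balanced   = N * 5   # MEM split into two 5 ns stages; clock = 5 ns

print(t_nonpipe / t_unbalanced)  # 2.8: speedup of the unbalanced pipeline
print(t_nonpipe / t_balanced)    # 5.6: speedup after splitting MEM
```

Balancing the stages doubles the clock rate, so the speedup over the non-pipelined design doubles from 2.8 to 5.6.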