CS140 Lab 06: Determining Processor and Compiler CPI and Comparing C and Java Performance, Slides of Computer Architecture and Organization

A lab guide for CS140 students to gain an understanding of processors and their behavior. The lab focuses on calculating the CPI (cycles per instruction) for a given program using both C and Java, and comparing their performance. Students are required to determine the number of instructions executed per loop in each mode and measure the influence of the operating system on performance.

Typology: Slides

2012/2013

Uploaded on 04/24/2013

baijayanthi

CS140 - Lab06 1
Computer Organization
CS 140
Processors

The Lab - Overview

The purpose of this lab is to gain understanding of processors and their behavior. You will do this by writing code that “tickles” the processor, making it interact in a particular way; and from this you learn characteristics about that processor. You’re doing experiments on the hardware!

In this lab, you have five very specific tasks.

  1. Modify my starter code in C. Run that code on three different processors in order to determine CPI. Compare your results.
  2. Compare your C and Java code on the same machine. How do they perform relative to each other?
  3. Compare how your code runs in a dual core environment.
  4. How does your code run on Windows vs. Linux on the same machine?
  5. Use a supplied program to measure the performance penalty resulting from a control hazard in your code.

Background: Useful Information

A note on compiler efficiency. I often compile code with the ability to debug. The “-g” switch does this. So if I say “gcc -g Prog06.c -o Prog06” I get an executable where I can use gdb – and a debugger is the greatest timesaver known to the programming world.

But this doesn’t produce efficient code – to get efficiency you need to compile with optimization “-O4”. While we’re on the topic of switches, the “-S” switch tells the compiler to produce assembler output and NOT to produce an executable.

The cycle speed of your computer can be determined as follows: On LINUX, the file /proc/cpuinfo contains the rated processor speed, as well as lots of other goodies. On Windows, Start → Control Panel → System tells you the processor speed.

Appendix A.2 shows how to find the number of instructions in a loop.

Warm-up

The code for my C and Java programs is available to you – no need to type in lots of instructions. They can be found at Prog06.c and Prog06.java. Here you can also find Prog06.s and Prog06.exe. The source code for these programs is also listed in Appendices A and B of this document.

The task asked of you here is to determine the processor/compiler CPI. I’m showing you how I did this task so you can see a concrete example.

Look at my Prog06.c and the resulting .s file. My code accomplishes a very simple task – just finding the sum of the numbers between 1 and 1,000,000.

Try calculating the CPI of my code:

  • Figure out how many instructions are required to accomplish the 1 million sum.
  • Time the number of seconds to accomplish that task.
  • Find the speed of the processor – the number of cycles/second.
  • Calculate the CPI.

ALL THIS IS JUST WARMUP.

Task 1: Determine the Program CPI

  1. This is a repeat of the warmup on the last few slides. Look at my sample C and Java programs. My code accomplishes a very simple task – just finding the sum of the numbers between 1 and 1,000,000. Try calculating the CPI of my code:
    • Figure out how many instructions are required to accomplish the 1 million sum.
    • Time the number of seconds to accomplish that task.
    • Find the speed of the processor – the number of cycles/second.
    • Calculate the CPI.
    ALL THIS IS JUST WARMUP, because you should
  2. Determine the CPI of a piece of your OWN code – and then plan experiments as explained later.
    • Write your own code to accomplish your own simple task. For example, my code does Global += index; Yours could do Global *= Index; or Global += Index / 93; or Global += Index * Index; You get the idea – there are many possibilities. Try not to pick something that your buddies are doing since there are so many choices.
    • Follow the example to determine the CPI of your own code.

Task 1: Example Calculation

Here’s how I filled in the sheet when I ran a part of a Task 1 experiment.

Task 1: CPI Calculation

  • Program Run: Original Prog06.exe
  • Compiler Used: gcc
  • Operating System: Windows XP
  • Processor Name: AMD Athlon XP 3000+
  • Processor Cycle Speed: 2,160,000,000 cycles/sec, or 2.16 GHz
  • Instructions per loop: 4 (see Appendix A)
  • Loops per reporting period: 1,000,000,000
  • Seconds in reporting period: 0.937
  • Instructions executed per sec.: 4,000,000,000 instr. / 0.937 sec = 4,268,943,436 instr./sec
  • CPI: (2,160,000,000 cycles/sec) / (4,268,943,436 instr./sec) ≈ 0.506

Background: Experimental Design

Now you know that in any experimental design, when you have a whole list of things you can change, it’s important to change only one variable at a time.

What are the variables you have in this experiment?

  a) The processor, and therefore the processor cycles/second.
  b) The program to be run – we’ve discussed three options for this so far –
    • Program.exe
    • Program – a LINUX executable (could be an Apple executable also)
  c) The compiler used for your code. The compiler you run to get Program.exe might not be the one you use to get a LINUX executable.
  d) The Operating System on which you run.
  e) And there might be OTHER variables I haven’t thought of here.

The POINT IS – when you design your experiment to compare three processors, you need to keep other variables the same.

Task 1: Determine the Program Performance

Reporting: In the Show and Tell section, you will find a table like this to be filled out for your three different machines.

Task 1: CPI Calculation

  • Program Run:
  • Compiler Used:
  • Operating System:
  • Processor Name:
  • Processor Cycle Speed:
  • Instructions per loop:
  • Loops per reporting period:
  • Seconds in reporting period:
  • Instructions executed per sec.:
  • CPI:

Task 2: Performance of C and Java

We want to compare the performance of C and Java. Keeping in mind our need to change only one variable, this experiment needs to be done on the same machine – you want to change ONLY language. Completion time means how long it takes to accomplish equal work. Timing should be for several iterations and should last for about 30 seconds each to ameliorate initial effects.

The steps are simple here:

  1. Modify both the C and Java programs – the two codes should perform identical tasks, but those tasks must be different from my starter code.
  2. Run those programs on a machine, keeping other variables as consistent as possible.
  3. I’ve filled in some possible values in the table below.

2: Compare C and Java Performance

  • Processor Name: Intel Pentium 4 Core Duo
  • Processor Cycle Speed: 2.6 GHz
  • Program run: MyCode.c, MyCode.java (one row each)
  • Compiler Used:
  • Operating System: Linux / Windows
  • Completion Time (secs):

Task 3: Determine the Program Performance

The machines we have in the lab are dual core processors. You should read more about this, but the essence is that the processors can run two programs simultaneously. I would like to know what happens when you run two copies of your program at the same time – what is the execution time for each of the programs when running in this mode? How do the 2 executions interfere? Why?

To facilitate running multiple copies of a program, you should learn about “&” and running in the background. By the way, DO NOT RUN ON CSGATEWAY (younger) since this will clog up the machine. You should also learn some useful LINUX commands, including “&”, “ps aux”, “kill”, and “killall Prog06”.

3: Run a program on a lab machine in single and dual-core mode

  • Processor Name:
  • Processor Cycle Speed:
  • Program run (one row each): C – 1 copy, C – 2 copies, Java – 1 copy, Java – 2 copies
  • Compiler:
  • Operating System:
  • Completion Time (secs):

Task 4: Determine the OS Influence

So here’s the question. Should the Operating System make any difference when running your code? There is of course only one way to find out the real answer – measure it. To do this, run your C or Java program on the same machine, but change only the OS that’s running. It’s possible to do this with our dual-booted lab machines.

4: Measure the influence of the Operating System

  • Processor Name: Intel Pentium 4 Core Duo
  • Processor Cycle Speed: 2.6 GHz
  • Program run: MyCode.c OR MyCode.java
  • Operating System: Linux etc. (listed on two rows)
  • Compiler:
  • Completion Time:

Task 5: The cost of a control hazard.

Your task is to do the following:

Determine the number of processor cycles that are lost when a branch is mispredicted. You want one number – BUT there are a number of steps and lots of required documentation to get this number.

What are you given?

Here you work only with the program I’m providing. Branch.c can be downloaded and the source is also available in Appendix C of this document. You are also given one (1) brain.

What does the program do?

This program provides a way to outsmart the processor – to ensure that the processor cannot predict whether a branch will be taken or not. This is difficult to do – the processor is VERY smart and only by going to random branching can we fool it. Look at the code in Appendix C to understand how this is done.

Step 4:

Fill in the table on the next page. Bring it to lab.

Step 5:

Determine the number of cycles in an instruction loop in branch.s when the code is correctly predicted and when it is mispredicted. How many cycles does the Pentium 4 lose on that one instruction as a result of misprediction?

Step 6:

When all else fails, and you don’t understand these instructions, don’t blindly follow your misunderstanding. Go back to the previous pages and understand the goal. Then do it the best you can. This is difficult stuff!

Branch Misprediction Work Sheet

This table can help you with the calculation. Its columns are:

  • Mode – Which mode (0, 1, 2) we’re running in.
  • Hz – What is the cycle speed of the processor (normally in GHz)?
  • Execution Time – How long does it take to execute the program?
  • Instr / Loops (using stepi) – From Appendix C2, how many instructions are there in a loop?
  • Instr loop – How many instruction loops were executed in the reported time?
  • Instr / second – Calculate the number of instructions executed in one second.
  • Machine Cycles / Loop – How many machine cycles were required to execute the number of instructions in one loop?
  • CPI – What is the CPI for the program running in this mode?