Computer Architecture Assignment 01, Exams of Advanced Computer Architecture

Computer Architecture Assignment Questions

Typology: Exams

2018/2019

Uploaded on 10/18/2019

prince141286

Problem 1 (20 points)

A program’s run time is determined by the product of instructions per program, cycles per instruction, and clock cycle time (the reciprocal of clock frequency). Assume the following instruction mix for a MIPS-like RISC instruction set: 15% stores, 25% loads, 15% branches, 30% integer arithmetic, 5% integer shift, and 5% integer multiply. Given that stores require one cycle, load instructions require two cycles, branches require four cycles, integer ALU instructions require one cycle, and integer multiplies require ten cycles, compute the overall CPI.

Ans:

Shifts are counted as one-cycle integer ALU instructions.

Overall CPI = (15% × 1) + (25% × 2) + (15% × 4) + (30% × 1) + (5% × 1) + (5% × 10) = 2.10
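
The weighted sum above can be checked with a few lines of Python. This is a minimal sketch: the dictionary name `mix` and its keys are illustrative, shifts are treated as one-cycle integer ALU instructions as in the answer, and the fractions are taken as stated even though they sum to 95% rather than 100%.

```python
# Instruction mix from the problem: name -> (fraction of instructions, cycles).
# Key names are illustrative, not taken from the assignment.
mix = {
    "store": (0.15, 1),
    "load": (0.25, 2),
    "branch": (0.15, 4),
    "int_arith": (0.30, 1),
    "int_shift": (0.05, 1),   # assumed to be a one-cycle ALU instruction
    "int_mul": (0.05, 10),
}

# Overall CPI is the frequency-weighted average of per-class cycle counts.
cpi = sum(fraction * cycles for fraction, cycles in mix.values())

# Sanity check on the mix itself: the stated fractions cover 95% of instructions.
total_fraction = sum(fraction for fraction, _ in mix.values())

print(f"CPI = {cpi:.2f}, mix covers {total_fraction:.0%} of instructions")
```

If the remaining 5% of the mix were specified, its cycle count would simply be added as one more entry.
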

Problem 2 (15 points)

MIPS (millions of instructions per second) was commonly used to gauge computer system performance up until the 1980s. Explain why it can be a very poor measure of a processor’s performance. Are there any circumstances under which it is a valid measure of performance? If so, describe those circumstances.

Ans: MIPS (millions of instructions per second) can be a very poor measure of a processor’s performance because it ignores other factors that determine run time, such as the instruction set architecture (ISA) used and the number of instructions executed. A machine that needs more instructions to do the same work can post a higher MIPS rating and still take longer to finish.

MIPS is a valid measure of performance when processors are compared under all of the following conditions:

  1. The same program is used
  2. The same ISA is used
  3. The same compiler is used

Under these conditions the instruction count is identical, so a higher MIPS rating implies a shorter run time.
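
To make the pitfall concrete, here is a small sketch with two hypothetical machines running the same task. The instruction counts and run times are invented for illustration only; the point is that the machine with the higher MIPS rating is the slower one because it executes twice as many instructions.

```python
def mips(instruction_count: int, seconds: float) -> float:
    """MIPS rating: instructions executed per second, in millions."""
    return instruction_count / (seconds * 1e6)

# Hypothetical figures, not from the assignment.
x_instructions, x_seconds = 2_000_000_000, 2.0   # machine X: more, simpler instructions
y_instructions, y_seconds = 1_000_000_000, 1.5   # machine Y: fewer instructions

x_mips = mips(x_instructions, x_seconds)
y_mips = mips(y_instructions, y_seconds)

print(f"X: {x_mips:.0f} MIPS in {x_seconds} s")
print(f"Y: {y_mips:.0f} MIPS in {y_seconds} s")
```

Machine X rates higher in MIPS, yet machine Y finishes the task sooner.
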

Problem 3 (15 points)

MFLOPS (millions of floating-point operations per second) was commonly used to gauge computer system performance up until the 1980s. Explain why it can be a very poor measure of a processor’s performance. Are there any circumstances under which it is a valid measure of performance? If so, describe those circumstances.

Ans: MFLOPS (millions of floating-point operations per second) can be a very poor measure of a processor’s performance because the set of floating-point operations is not consistent across computers, so the number of floating-point operations actually performed for the same program may vary. The MFLOPS rating also changes with the mixture of integer and floating-point operations and with the mixture of fast and slow floating-point operations.

It is a valid measure of performance if we define a method of counting the number of floating-point operations in a high-level language program. This counting process can also weight the operations, giving more complex operations larger weights, so that a computer can achieve a high MFLOPS rating even if the program contains many floating-point divides. These MFLOPS might be called normalized MFLOPS. Because of the counting and weighting, normalized MFLOPS may be very different from the actual rate at which a computer executes floating-point operations.
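
The difference between native and normalized MFLOPS can be sketched as follows. The weight table, operation counts, and run time are assumptions chosen for illustration; the answer above does not specify particular weights.

```python
# Illustrative weights: complex operations count for more than simple ones.
# These values are an assumption of this sketch, not part of the answer.
WEIGHTS = {"add": 1, "mul": 1, "div": 4, "sqrt": 4}

def native_mflops(op_counts: dict, seconds: float) -> float:
    """Raw floating-point operations per second, in millions."""
    return sum(op_counts.values()) / (seconds * 1e6)

def normalized_mflops(op_counts: dict, seconds: float, weights=WEIGHTS) -> float:
    """Weighted floating-point operations per second, in millions."""
    weighted_ops = sum(weights[op] * n for op, n in op_counts.items())
    return weighted_ops / (seconds * 1e6)

# Hypothetical source-level operation counts for one program run.
counts = {"add": 2_000_000, "mul": 1_000_000, "div": 1_000_000}
elapsed = 0.5  # seconds, hypothetical

native = native_mflops(counts, elapsed)
normalized = normalized_mflops(counts, elapsed)
print(f"native = {native:.1f} MFLOPS, normalized = {normalized:.1f} MFLOPS")
```

With these numbers the divide-heavy program rates noticeably higher in normalized MFLOPS than in native MFLOPS, which is the effect the weighting is meant to produce.
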

Problem 4 (30 points)

You are given the following benchmark code:

double A[1024], B[1024], C[1024];

for (int i = 0; i < 1000; i += 2) {
    A[i] = 35.0 * B[i] + C[i+1];
}

4.3) The first iteration accesses memory locations &B[0], &C[1], and &A[0]. Unfortunately, since the arrays are consecutive in memory, these locations are exactly 8 KB (1024 x 8 B per double) apart. Hence, in a two-way set-associative cache they conflict, and the access to A[0] will evict B[0]. In the second iteration, the access to B[1] will evict C[1], and so on. However, since the access to C is offset by 1 double (8 bytes), in the seventh iteration it will access C[8], which does not conflict with B[7]. Hence, B[7] will hit, as will A[7]. In the eighth iteration, C[9] will also hit, but now B[8] and A[8] will again conflict, and no further hits will result. Hence, there are three hits every eight iterations, leading to a total of floor(1000/8)*3 = 375 hits. The number of misses is 3000-375 = 2625, for an overall miss rate of 2625/3000 = 87.5%.
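
The totals quoted in this answer follow from simple arithmetic. The sketch below takes the answer's own figures as given (1000 iterations, three memory references per iteration, three hits in every group of eight iterations). It does not simulate the cache, whose parameters are not stated in this part of the document.

```python
# Figures taken from the answer text; variable names are illustrative.
iterations = 1000
refs_per_iteration = 3       # B[i], C[i+1], A[i]
hits_per_group = 3
iterations_per_group = 8     # one group per block of eight doubles, per the answer

total_refs = iterations * refs_per_iteration
hits = (iterations // iterations_per_group) * hits_per_group
misses = total_refs - hits
miss_rate = misses / total_refs

print(f"refs={total_refs}, hits={hits}, misses={misses}, miss rate={miss_rate:.1%}")
```
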

Problem 5 (20 points)

Consider a processor with 32-bit virtual addresses, 4KB pages and 36-bit physical addresses. Assume memory is byte-addressable (i.e. the 32-bit VA specifies a byte in memory).

  • L1 instruction cache: 64 Kbytes, 128 byte blocks, 4-way set associative, indexed and tagged with virtual address.
  • L1 data cache: 32 Kbytes, 64 byte blocks, 2-way set associative, indexed and tagged with physical address, write-back.
  • 4-way set associative TLB with 128 entries in all. Assume the TLB keeps a dirty bit, a reference bit, and 3 permission bits (read, write, execute) for each entry.

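Specify the number of offset, index, and tag bits for each of these structures in the table below. Also, compute the total size in number of bit cells for each of the tag and data arrays.

| Structure | Offset bits | Index bits | Tag bits | Size of tag array | Size of data array |
|-----------|-------------|------------|----------|-------------------|--------------------|
| I-cache   |             |            |          |                   |                    |
| D-cache   |             |            |          |                   |                    |
| TLB       |             |            |          |                   |                    |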

Ans:

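| Structure | Offset bits | Index bits | Tag bits | Size of tag array | Size of data array |
|-----------|-------------|------------|----------|-------------------|--------------------|
| I-cache   | 7           | 7          | 18       |                   |                    |
| D-cache   | 6           | 8          | 22       |                   |                    |
| TLB       | 7           |            |          |                   |                    |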
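
The I-cache and D-cache bit counts in the answer can be checked with a short calculation. This is a minimal sketch: the helper name `cache_fields` is hypothetical, and it covers only the two caches, using 32-bit virtual addresses for the I-cache and 36-bit physical addresses for the D-cache as the problem states. The data array size is counted as the cache capacity in bits; tag array sizes are left out because they depend on which status bits are stored alongside each tag.

```python
def cache_fields(size_bytes: int, block_bytes: int, ways: int, addr_bits: int):
    """Return (offset_bits, index_bits, tag_bits) for a set-associative cache.

    Assumes size, block size, and set count are all powers of two.
    """
    offset_bits = block_bytes.bit_length() - 1        # log2(block size)
    num_sets = size_bytes // (block_bytes * ways)
    index_bits = num_sets.bit_length() - 1            # log2(number of sets)
    tag_bits = addr_bits - index_bits - offset_bits
    return offset_bits, index_bits, tag_bits

# I-cache: 64 KB, 128-byte blocks, 4-way, virtually indexed and tagged (32-bit VA).
icache = cache_fields(64 * 1024, 128, 4, 32)
# D-cache: 32 KB, 64-byte blocks, 2-way, physically indexed and tagged (36-bit PA).
dcache = cache_fields(32 * 1024, 64, 2, 36)

# Data array size in bit cells is the capacity in bytes times 8.
icache_data_bits = 64 * 1024 * 8
dcache_data_bits = 32 * 1024 * 8

print("I-cache (offset, index, tag):", icache, "data bits:", icache_data_bits)
print("D-cache (offset, index, tag):", dcache, "data bits:", dcache_data_bits)
```
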