






Computer Architecture Assignment Question
Assembly Language      Machine Language                            Comments
add $t0, $s1, $s2      000000 10001 10010 01000 00000 100000
sll $t2, $s7, 4        000000 00000 10111 01010 00100 000000
sw $t4, 0($t2)         101011 01010 01100 0000000000000000         memory address ($t2+0)
j 0x092A               000010 0000000000 0000 1001 0010 1010       16'h092A, 2'b00}
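The field layout behind these encodings can be checked with a short script. This is only a sketch: the helper names (`r_type`, `i_type`, `j_type`) are hypothetical, the register numbers follow the standard MIPS convention, and the `j` row is assumed to place 0x092A directly in the 26-bit target field.

```python
# Standard MIPS register numbers for the registers used in the table.
REG = {"$zero": 0, "$t0": 8, "$t2": 10, "$t4": 12,
       "$s1": 17, "$s2": 18, "$s7": 23}


def _pack(fields):
    """Render (value, width) pairs as space-separated binary fields."""
    return " ".join(format(value, f"0{width}b") for value, width in fields)


def r_type(rs, rt, rd, shamt, funct):
    """R-type: opcode(6)=0, rs(5), rt(5), rd(5), shamt(5), funct(6)."""
    return _pack([(0, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)])


def i_type(opcode, rs, rt, imm):
    """I-type: opcode(6), rs(5), rt(5), immediate(16)."""
    return _pack([(opcode, 6), (rs, 5), (rt, 5), (imm & 0xFFFF, 16)])


def j_type(opcode, target):
    """J-type: opcode(6), target(26)."""
    return _pack([(opcode, 6), (target & 0x3FFFFFF, 26)])


# add $t0, $s1, $s2 : funct 0x20, rd=$t0, rs=$s1, rt=$s2
add_bits = r_type(REG["$s1"], REG["$s2"], REG["$t0"], 0, 0x20)
# sll $t2, $s7, 4 : funct 0x00, rs unused, rt=$s7, rd=$t2, shamt=4
sll_bits = r_type(0, REG["$s7"], REG["$t2"], 4, 0x00)
# sw $t4, 0($t2) : opcode 0x2B, base rs=$t2, source rt=$t4, offset 0
sw_bits = i_type(0x2B, REG["$t2"], REG["$t4"], 0)
# j 0x092A : opcode 0x02 (assumption: 0x092A is the target field itself)
j_bits = j_type(0x02, 0x092A)

for name, bits in [("add", add_bits), ("sll", sll_bits),
                   ("sw", sw_bits), ("j", j_bits)]:
    print(f"{name:4s} {bits}")
```

Printing the fields separately makes it easy to compare each one against the assembly operands.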
Solution:
(1) One example of MIPS code:

        clear $t0                  # $t0 = 0 (pseudo-instruction)
        addi  $s0, $zero, 100      # $s0 = 100
loop:   lw    $t1, 0($a1)          # $t1 = b[i]
        add   $t1, $t1, $s0        # $t1 = b[i] + $s0 (value for a[i])
        sw    $t1, 0($a0)          # store $t1 to address of a[i]
        addi  $a0, $a0, 4          # $a0 = address of a[i+1]
        addi  $a1, $a1, 4          # $a1 = address of b[i+1]
        addi  $t0, $t0, 1          # $t0 = $t0 + 1
        beq   $t0, $s0, finish     # if ($t0 == 100) finish
        j     loop
finish:

(A different program with the same behavior can get full credit, too.)
(2) The 2 instructions before loop are executed 1 time; the 7 instructions from loop through beq are executed 101 times; the instruction "j loop" is executed 100 times. Therefore:
Total instructions executed (in this case): 2*1 + 7*101 + 1*100 = 809.
(3) Memory data references: 101 * 2 = 202 (one lw and one sw per pass through the loop body).
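As a quick check of the arithmetic in (2) and (3), the counts can be tallied in a few lines. This is a sketch that simply takes the execution counts stated in the solution as given; the dictionary keys are illustrative names, not part of the original.

```python
# (static instruction count, times executed), as stated in the solution.
groups = {
    "before_loop": (2, 1),     # clear, addi
    "loop_to_beq": (7, 101),   # lw, add, sw, addi, addi, addi, beq
    "jump": (1, 100),          # j loop
}

total_instructions = sum(count * times for count, times in groups.values())

# Each pass through the loop body performs one lw and one sw.
memory_refs_per_pass = 2
memory_refs = memory_refs_per_pass * groups["loop_to_beq"][1]

print(total_instructions, memory_refs)
```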
Consider two implementations, P1 and P2, of the same instruction set. There are five classes of instructions (A, B, C, D, and E) in the instruction set. P1 has a clock rate of 4 GHz. P2 has a clock rate of 6 GHz. The average number of cycles for each instruction class for P1 and P2 is as follows:

Class    CPI on P1    CPI on P2
A        1            2
B        2            2
C        3            2
D        4            4
E        3            4
(1) (5 pts.) Assume that peak performance is defined as the fastest rate at which a computer can execute any instruction sequence. What are the peak performances of P1 and P2, expressed in instructions per second?
(2) (5 pts.) If the number of instructions executed in a certain program is divided equally among the classes of instructions except for class A, which occurs twice as often as each of the others, how much faster is P2 than P1?
Solution:
(1) The maximum IPC is 1 on P1 (lowest CPI is 1, class A) and 0.5 on P2 (lowest CPI is 2, classes A, B, and C).
IPS = IPC * (number of cycles in 1 s). Therefore IPS for P1 = 1 * 4G = 4e9; IPS for P2 = 0.5 * 6G = 3e9.
(2) Average CPI = (2A+B+C+D+E) / (2+1+1+1+1). Here A, B, C, D, E refer to the CPI of each instruction class on the given implementation.
On P1, CPI = (2+2+3+4+3)/6 = 14/6; avg. time to execute 1 instruction: (14/6)/4GHz.
On P2, CPI = (4+2+2+4+4)/6 = 16/6; avg. time to execute 1 instruction: (16/6)/6GHz.
(Speed of P2) / (Speed of P1) = (avg. time per instruction on P1) / (avg. time per instruction on P2) = 21/16, or about 1.31. Therefore P2 is about 1.31 times faster than P1.
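The peak rates and the 21/16 ratio can be reproduced with exact fractions. The following is a minimal sketch; the names `CPI`, `CLOCK_HZ`, `MIX`, and the helper functions are introduced here for illustration only.

```python
from fractions import Fraction

# CPI per instruction class and clock rates, taken from the problem statement.
CPI = {
    "P1": {"A": 1, "B": 2, "C": 3, "D": 4, "E": 3},
    "P2": {"A": 2, "B": 2, "C": 2, "D": 4, "E": 4},
}
CLOCK_HZ = {"P1": 4_000_000_000, "P2": 6_000_000_000}

# Instruction mix for part (2): class A occurs twice as often as each other class.
MIX = {"A": 2, "B": 1, "C": 1, "D": 1, "E": 1}


def peak_ips(proc):
    """Peak instructions per second: clock rate divided by the lowest CPI."""
    return Fraction(CLOCK_HZ[proc], min(CPI[proc].values()))


def average_cpi(proc):
    """Mix-weighted average CPI."""
    weighted = sum(MIX[c] * CPI[proc][c] for c in MIX)
    return Fraction(weighted, sum(MIX.values()))


def time_per_instruction(proc):
    """Average seconds per instruction under the given mix."""
    return average_cpi(proc) / CLOCK_HZ[proc]


speedup_p2_over_p1 = time_per_instruction("P1") / time_per_instruction("P2")

print(peak_ips("P1"), peak_ips("P2"))        # instructions per second
print(average_cpi("P1"), average_cpi("P2"))  # reduced fractions
print(speedup_p2_over_p1, float(speedup_p2_over_p1))
```

Using `Fraction` keeps the ratio exact, so the result can be compared directly with 21/16 instead of a rounded decimal.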
(1) (10 pts.) You are going to enhance a computer, and there are two possible improvements: either make multiply instructions run four times faster than before, or make memory access instructions run two times faster than before. You repeatedly run a program that takes 100 seconds to execute. Of this time, 20% is used for multiplication, 50% for memory access instructions, and 30% for other tasks. What will the speedup be if you improve only multiplication? What will the speedup be if you improve only memory access? What will the speedup be if both improvements are made?
(2) (5 pts.) You are going to change the program described in (1) so that the percentages are not 20%, 50%, and 30% anymore. Assuming that none of the new percentages is 0, what sort of program would result in a tie (with regard to speedup) between the two individual improvements? Provide both a formula and some examples.
Solution:
(1) The times used for multiply and memory access are 20s and 50s, respectively.
Accelerate only multiply: (100-20)+20/4 = 85 sec; speedup = 100/85 ≈ 1.18.
Accelerate only memory access: (100-50)+50/2 = 75 sec; speedup = 100/75 ≈ 1.33.
Accelerate multiply and memory access: (100-20-50)+20/4+50/2 = 60 sec; speedup = 100/60 ≈ 1.67.
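The three cases follow the same pattern (untouched time plus each accelerated part divided by its factor), so they can be recomputed with one helper. This is a sketch; `improved_time` and the constant names are hypothetical.

```python
def improved_time(total, parts):
    """Execution time after speeding up some parts of a program.

    parts is a list of (seconds_spent, speedup_factor) pairs; everything
    not listed keeps its original duration.
    """
    untouched = total - sum(seconds for seconds, _ in parts)
    return untouched + sum(seconds / factor for seconds, factor in parts)


TOTAL = 100.0          # seconds, from the problem statement
MULTIPLY = (20.0, 4.0) # 20 s of multiplication, made 4x faster
MEMORY = (50.0, 2.0)   # 50 s of memory access, made 2x faster

t_multiply = improved_time(TOTAL, [MULTIPLY])
t_memory = improved_time(TOTAL, [MEMORY])
t_both = improved_time(TOTAL, [MULTIPLY, MEMORY])

for label, t in [("multiply only", t_multiply),
                 ("memory only", t_memory),
                 ("both", t_both)]:
    print(f"{label:14s} time = {t:5.1f} s, speedup = {TOTAL / t:.2f}")
```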
(2) Suppose the percentage of time used by multiply is a%, the percentage of memory access is b%. To make a tie between two individual improvements, we will have
(100-a) + a/4 = (100-b) + b/2
=> b = 1.5a (0 < a, b < 100)
For example, if multiply consumes 20% of the time, the two improvements have equal effect only when memory access instructions consume 30% of the time.
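Further examples can be generated from b = 1.5a. Since no percentage may be 0, the remaining share 100 - a - b must stay positive, which means a < 40. The sketch below checks a few such pairs; the function names are illustrative only.

```python
from fractions import Fraction


def time_after(percent, factor):
    """Time (out of 100) after speeding up `percent` of it by `factor`."""
    return (100 - percent) + Fraction(percent, factor)


def tie_memory_percent(multiply_percent):
    """Memory-access percentage that ties with the given multiply percentage."""
    return Fraction(3, 2) * multiply_percent


examples = []
for a in (10, 20, 30):
    b = tie_memory_percent(a)
    other = 100 - a - b
    # Both individual improvements must give the same execution time,
    # and the remaining share must stay above zero.
    tied = time_after(a, 4) == time_after(b, 2)
    examples.append((a, b, other, tied))

for a, b, other, tied in examples:
    print(f"multiply {a}%, memory {b}%, other {other}%: tie = {tied}")
```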