









University of California, Berkeley CS 152 Midterm I exam for Computer Architecture and Engineering. Contains questions on instruction CPI, compiler improvements, execution time, microprocessor costs, and logic gates. Students are required to calculate averages, ratios, and delays.
University of California at Berkeley College of Engineering Computer Science Division - EECS
CS 152 D. Patterson S. Kong Spring 1995
Computer Architecture and Engineering Midterm I
Your Name: SID Number: Discussion Section:
You may bring two pages of notes. You have 180 minutes. Each question carries 20 points. Show your work. Write neatly and be well organized. Good Luck!
Problem Score
1
2
3
4
5
Total
The clock rate for your machine is 100MHz. Your machine has a floating-point unit. The video game has 500 million total instructions.
Here are the measurements for your video game:
Instruction   CPI   Frequency
A             2     35%
B             5     30%
C             4     20%
D             4     15%
a) What is the average CPI for your machine when running this program? [4 points]
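The average CPI asked for in part a) is a frequency-weighted sum over the instruction classes. The following is a minimal sketch of that calculation in C; the names `instr_class`, `average_cpi`, and `video_game_mix` are illustrative and do not come from the exam.

```c
#include <assert.h>
#include <stddef.h>

/* One row of the instruction-mix table: cycles per instruction and the
 * fraction of executed instructions in that class (0.35 means 35%).
 * The type and field names are hypothetical, chosen for this sketch. */
struct instr_class {
    double cpi;
    double frequency;
};

/* Weighted-average CPI: the sum over all classes of CPI_i * frequency_i.
 * The frequencies are expected to sum to 1. */
double average_cpi(const struct instr_class *mix, size_t n)
{
    double cpi = 0.0;
    double total = 0.0;

    for (size_t i = 0; i < n; i++) {
        cpi += mix[i].cpi * mix[i].frequency;
        total += mix[i].frequency;
    }
    /* Guard against a mix that does not cover 100% of the instructions. */
    assert(total > 0.999 && total < 1.001);
    return cpi;
}

/* Measured mix for the video game, classes A, B, C, D in table order. */
static const struct instr_class video_game_mix[] = {
    {2.0, 0.35}, {5.0, 0.30}, {4.0, 0.20}, {4.0, 0.15},
};
```

Calling `average_cpi(video_game_mix, 4)` evaluates 2(0.35) + 5(0.30) + 4(0.20) + 4(0.15) = 3.6 cycles per instruction.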
b) A friend of yours gives you a new compiler to try out. The instruction count improvements resulting from this compiler are as follows. [4 points]
Instruction Class   Percentage of Instructions Executed vs. Original Machine
A                   80%
B                   100%
C                   95%
D                   80%
What is the average CPI for your machine when running your program as compiled with this new compiler?
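One way to approach part b) is to scale each class's original share by the percentage still executed, then divide total cycles by total instructions. The sketch below assumes the table means "this class now executes N% as many instructions as before"; the names `class_change`, `recompiled_cpi`, and `new_compiler_mix` are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record combining both tables for one instruction class. */
struct class_change {
    double cpi;        /* cycles per instruction for this class */
    double frequency;  /* original share of executed instructions */
    double remaining;  /* fraction still executed after recompiling */
};

/* Returns the new average CPI. If count_ratio is not NULL, it receives the
 * new instruction count as a fraction of the original count. */
double recompiled_cpi(const struct class_change *c, size_t n,
                      double *count_ratio)
{
    double cycles = 0.0;  /* cycles per original instruction */
    double instrs = 0.0;  /* instructions per original instruction */

    for (size_t i = 0; i < n; i++) {
        double share = c[i].frequency * c[i].remaining;
        cycles += share * c[i].cpi;
        instrs += share;
    }
    assert(instrs > 0.0);
    if (count_ratio != NULL)
        *count_ratio = instrs;
    return cycles / instrs;
}

/* Classes A, B, C, D: CPI, original frequency, percentage remaining. */
static const struct class_change new_compiler_mix[] = {
    {2.0, 0.35, 0.80}, {5.0, 0.30, 1.00},
    {4.0, 0.20, 0.95}, {4.0, 0.15, 0.80},
};
```

Under that reading, the instruction count drops to 0.89 of the original and the new CPI is 3.30 / 0.89, roughly 3.71. The average CPI goes up even though the total cycle count goes down, because the compiler mostly removes the cheap class A instructions.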
Extra Credit: What is your opinion of this friend?
The first option is a Gallium Arsenide (GaAs) microprocessor. A GaAs wafer that is 10 cm (≈ 4 inches) in diameter costs $2000. The manufacturing process creates 4 defects per square centimeter. The microprocessor fabricated in this technology is expected to have a clock rate of 1000 MHz, with an average clock cycles per instruction of 2.0 if we assume an infinitely fast memory system. The size of the GaAs microprocessor is 1.0 cm by 1.0 cm. (Assume α is the same for GaAs as for CMOS: α = 2.0.)
The second option is a CMOS microprocessor. A 15 cm (≈ 6 inch) wafer with 1 defect per square centimeter costs $1000. The 1.0 cm by 2.0 cm microprocessor executes multiple instructions per clock cycle to achieve an average clock cycles per instruction of 1.0, assuming an infinitely fast memory, while achieving a clock rate of 200 MHz. (The microprocessor is larger because of on-chip caches and executing multiple instructions per clock cycle.)
Neither wafer has test dies on them.
Here are two equations you may find useful:
a) What is the cost of an untested GaAs die for this microprocessor? Show your work. [4 points]
b) What is the cost of an untested die for the CMOS microprocessor? Show your work. [4 points]
c) What is the ratio of cost of the GaAs die to the CMOS die? [1 point]
d) Calculate the average time for each instruction with an infinitely fast memory. Which is faster and by what factor? Show your work. [3 points]
e) Based on costs and performance ratios of the microprocessors calculated above, what is the ratio of cost/performance of the CMOS to GaAs microprocessors? [3 points]
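The two equations mentioned in the problem did not survive in this copy. The sketch below assumes the standard textbook forms: dies per wafer = π(d/2)²/A − πd/√(2A), and die yield = (1 + defects × A/α)^(−α) with α = 2 and a wafer yield of 1. Those forms, and every function name here, are assumptions rather than text from the exam. A small Newton iteration stands in for `sqrt` so that the sketch builds without linking the math library.

```c
#include <assert.h>

#define PI 3.14159265358979323846

/* Square root by Newton's method (avoids a dependency on libm). */
static double sqrt_newton(double v)
{
    assert(v > 0.0);
    double x = v > 1.0 ? v : 1.0;
    for (int i = 0; i < 60; i++)
        x = 0.5 * (x + v / x);
    return x;
}

/* Assumed form: wafer area / die area, minus the dies lost at the edge.
 * No test dies are subtracted, as the problem states. */
int dies_per_wafer(double wafer_diameter_cm, double die_area_cm2)
{
    double r = wafer_diameter_cm / 2.0;
    double dies = PI * r * r / die_area_cm2
                - PI * wafer_diameter_cm / sqrt_newton(2.0 * die_area_cm2);
    return (int)dies;  /* truncation rounds down to whole dies */
}

/* Assumed form with alpha fixed at 2.0 and wafer yield taken as 1. */
double die_yield(double defects_per_cm2, double die_area_cm2)
{
    double base = 1.0 + defects_per_cm2 * die_area_cm2 / 2.0;
    return 1.0 / (base * base);
}

/* Cost of an untested die: wafer cost spread over the good dies. */
double die_cost(double wafer_cost, double wafer_diameter_cm,
                double die_area_cm2, double defects_per_cm2)
{
    return wafer_cost / (dies_per_wafer(wafer_diameter_cm, die_area_cm2)
                         * die_yield(defects_per_cm2, die_area_cm2));
}

/* Average time per instruction in nanoseconds: CPI / clock rate. */
double ns_per_instruction(double cpi, double clock_mhz)
{
    return cpi * 1000.0 / clock_mhz;
}
```

Under these assumptions, `dies_per_wafer(10.0, 1.0)` gives 56 GaAs dies and `dies_per_wafer(15.0, 2.0)` gives 64 CMOS dies, with yields of 1/9 and 1/4 respectively. The remaining parts follow by calling `die_cost` and `ns_per_instruction` for each option and taking ratios.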
Input Load = 50 fF
Internal Delay (Output going from Low to High) = TPlh = 0.3 ns
Internal Delay (Output going from High to Low) = TPhl = 0.3 ns
Load Dependent Delay (Output going from L to H) = TPlhf = 0.010 ns/fF
Load Dependent Delay (Output going from H to L) = TPhlf = 0.002 ns/fF

Input Load = 50 fF
Internal Delay (Output going from Low to High) = TPlh = 0.3 ns
Internal Delay (Output going from High to Low) = TPhl = 0.3 ns
Load Dependent Delay (Output going from L to H) = TPlhf = 0.002 ns/fF
Load Dependent Delay (Output going from H to L) = TPhlf = 0.010 ns/fF
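The two parameter sets above fit a linear delay model in which total delay = internal delay + load-dependent delay × load capacitance. Here is a minimal sketch of that model, assuming the linear form holds. The gate labels did not survive in this copy, so `first_gate` and `second_gate` simply follow the order in which the parameters are listed, and the struct and function names are illustrative.

```c
#include <assert.h>

/* Timing parameters of one driver, in the units used by the problem. */
struct gate_timing {
    double tplh_ns;          /* internal delay, output low to high */
    double tphl_ns;          /* internal delay, output high to low */
    double tplhf_ns_per_ff;  /* load-dependent delay, low to high */
    double tphlf_ns_per_ff;  /* load-dependent delay, high to low */
};

/* Delay for a rising output driving load_ff femtofarads. */
double delay_low_to_high_ns(const struct gate_timing *g, double load_ff)
{
    assert(load_ff >= 0.0);
    return g->tplh_ns + g->tplhf_ns_per_ff * load_ff;
}

/* Delay for a falling output driving load_ff femtofarads. */
double delay_high_to_low_ns(const struct gate_timing *g, double load_ff)
{
    assert(load_ff >= 0.0);
    return g->tphl_ns + g->tphlf_ns_per_ff * load_ff;
}

/* Parameter sets in the order listed in the problem. */
static const struct gate_timing first_gate  = {0.3, 0.3, 0.010, 0.002};
static const struct gate_timing second_gate = {0.3, 0.3, 0.002, 0.010};
```

For a hypothetical 1000 fF wire load (a value chosen only for illustration, not given in the exam), the first parameter set yields 2.3 ns for a falling output and 10.3 ns for a rising one, and the second set yields the reverse.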
a) Assume we need to use one of these drivers to drive a long wire with high capacitive load (see diagram below) and our goal is to have a VERY FAST High to Low transition at the output, while the Low to High transition can be slow. Which gate should we use and why? [3 points]
b) Now assume the gate you chose from above is driven by the following logic:
i. In order to cause Out to make a High to Low transition, what kind of transition does Wire 2 have to make (H to L or L to H)? [2 points]
ii. In order to propagate a signal from Wire 1 to Wire 2, what value do we need to have at Q (0 or 1)? [2 points]
iii. In order to cause Out to make a High to Low transition, what kind of transition does Wire 1 have to make (H to L or L to H)? [2 points]
iv. In order to propagate a signal from In to Wire 1, what value do we need to have at P (0 or 1)? [ points]
v. In order to cause Out to make a High to Low transition, what kind of transition do we have to apply at In (H to L or L to H)? [2 points]
C Program

#include <stdio.h>
#include <stdlib.h>

extern void BubbleSort(int *, int);
extern void PrintResults(int *, int);

int main(void)
{
    int test1[10] = {5, 4, 3, 2, 1, 6};
    int test2[10] = {1, 3, 5, 3, 6, 5};
    int test3[10] = {5, 4, -4, 13, -1, 3, 2, 1, 6};
    int test4[10] = {5};
    int test5[10];

    BubbleSort(test1, 6);
    PrintResults(test1, 6);

    BubbleSort(test2, 6);
    PrintResults(test2, 6);

    BubbleSort(test3, 9);
    PrintResults(test3, 9);

    BubbleSort(test4, 1);
    PrintResults(test4, 1);

    BubbleSort(test5, 0);
    PrintResults(test5, 0);

    return 0;
}

void PrintResults(int *values, int numEntries)
{
    int i;

    if (numEntries == 0) {
        printf("empty\n");
        return;
    }

    for (i = 0; i < numEntries; i++)
        printf("%d ", values[i]);
    printf("\n");
}

/* sorts normal size integers only */
void BubbleSort(int *input, int numEntries)
{
    int temp, store;
    int i, j;

    if (numEntries == 0)
        return;

    for (i = 0; i < numEntries; i++) {
        temp = i;   /* index of the smallest element seen so far */
        for (j = i; j < numEntries; j++) {
            if (input[j] < input[temp]) {
                temp = j;
            }
        }
        /* swap the smallest remaining element into position i */
        store = input[i];
        input[i] = input[temp];
        input[temp] = store;
    }

    return;
}
exit:   j    r31              # return
        nop

swap:   add  r11, r10, r0     # temp <- j
        j    cont
        nop
One example of a new MIPS instruction that uses the SALU would be

    add3 rd, rs, rt, ra    ; Reg[rd] = Reg[rs] + Reg[rt] + Reg[ra]

What other changes would you make to the instruction set to take advantage of the SALU?
a) What other instructions would you add? Use the notation above, and explain why they might be useful. [4 points]
b) What new data addressing mode(s) would you add? Why would they be useful? [3 points]
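The register-transfer notation used for `add3` can be read as a small operation on a register file. The following C sketch models only what the notation states, assuming 32 registers with r0 hard-wired to zero as in MIPS; overflow handling is not specified in the problem, so the sum simply wraps around here. The helper `write_reg` and the array layout are illustrative choices.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 32

/* Write a register, keeping r0 hard-wired to zero as in MIPS. */
static void write_reg(uint32_t reg[NUM_REGS], unsigned rd, uint32_t value)
{
    assert(rd < NUM_REGS);
    if (rd != 0)
        reg[rd] = value;
}

/* add3 rd, rs, rt, ra : Reg[rd] = Reg[rs] + Reg[rt] + Reg[ra]
 * Unsigned arithmetic wraps modulo 2^32; any overflow trap the real
 * instruction might raise is deliberately left out of this sketch. */
void add3(uint32_t reg[NUM_REGS], unsigned rd,
          unsigned rs, unsigned rt, unsigned ra)
{
    assert(rs < NUM_REGS && rt < NUM_REGS && ra < NUM_REGS);
    write_reg(reg, rd, reg[rs] + reg[rt] + reg[ra]);
}
```

The same pattern (read three source registers, write one destination) can serve as a template when writing out the semantics of other candidate three-operand instructions.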