Docsity
Docsity

Prepare for your exams
Prepare for your exams

Study with the several resources on Docsity


Earn points to download
Earn points to download

Earn points by helping other students or get them with a premium plan


Guidelines and tips
Guidelines and tips

Yottabyte big -Introduction to Database Systems - Exams, Exams of Introduction to Database Management Systems

Main points of this past exam are: Yottabyte Big, Size Limits, Function, Recommend, Maxsize, Guarantee, Large Tables, Advertising Campaign, Intergalactic, Lecture Analyzing

Typology: Exams

2012/2013

Uploaded on 04/02/2013

shalin_p01ic
shalin_p01ic 🇮🇳

4

(7)

86 documents

1 / 11

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
UNIVERSITY OF CALIFORNIA
Department of EECS, Computer Science Division
CS186
Hellerstein
Fall 2007
Midterm Exam
Midterm Exam: Introduction to Database Systems
This exam has five problems, worth 20 points each. Each problem is made up of multiple questions. You
should read through the exam quickly and plan your time-management accordingly. Before beginning to
answer a problem, be sure to read it carefully and to answer all parts of every problem!
You must write your answers on the exam, in the spaces provided. Do not tear pages off of your exam!
Good luck!
1. Sorting
Zombo.com has begun advertising “sorting as a service” for their website. Users can upload a table of
numbers to Zombo.com, and the site will sort it for them and email it back. Unfortunately, after
launching their intergalactic advertising campaign, the executives at Zombo.com realize that they have
no idea how to deal with large tables. They hire you as a consultant to work this out for them.
a) [6 points] Zombo.com sorts one file at a time. You promise the executives that the I/O count for
sorting the file (including reading it from disk and writing the result to disk) will be 4 times that of
simply reading it. To guarantee this, you suggest introducing a limit on the size of file uploads to
some number maxsize bytes per file. The Zombo.com servers use 8K disk blocks, and have
128Kbytes of memory available for sorting. What value should you recommend for maxsize (in
Kb!)?
b) [6 points] The executives are unhappy with the idea of size limits. They remind you that “the
unattainable is unknown at Zombo.com!” They ask you how many disk I/Os it would take to sort a
file that is one Yottabyte big on a single server. Please state your answer as a function of the value Y
(where Y = 1 Yottabyte).
3 points for using a log of any kind.
1 point for log base 15.
1 point for the 2Y factor.
1 point for dividing by 8K and 16 appropriately.
Name: ____________________________________
Class Account:______________________________
(16*15 pages)*8KB/page
= 1920 KB
2(Y/8K) * (1 + log15(Y/(8K*16)) I/Os
pf3
pf4
pf5
pf8
pf9
pfa

Partial preview of the text

Download Yottabyte big -Introduction to Database Systems - Exams and more Exams Introduction to Database Management Systems in PDF only on Docsity!

Department of EECS, Computer Science DivisionUNIVERSITY OF CALIFORNIA

CS186 Hellerstein

Fall 2007 Midterm Exam

This exam has five^ Midterm Exam: Introduction to Database Systems problems, worth 20 points each. Each problem is made up of multiple questions. You

should read through the exam quickly and plan your time-management accordingly. Before beginning to answer a problem, be sure to read it carefully and to answer all parts of every problem! You must write your answers on the exam, in the spaces provided. Do not tear pages off of your exam! Good luck!

1. Sorting Zombo.com has begun advertising “sorting as a service” for their website. Users can upload a table of numbers to Zombo.com, and the site will sort it for them and email it back. launching their intergalactic advertising campaign, the executives at Zombo.com realize that they have Unfortunately, after no idea how to deal with large tables. They hire you as a consultant to work this out for them. a) [6 points] sorting the file (including reading it from disk and writing the result to disk) will be Zombo.com sorts one file at a time. You promise the executives that the I/O count for 4 times that of simply reading it. To guarantee this, you suggest introducing a limit on the size of file uploads to some number maxsize bytes per file. The Zombo.com servers use 8K disk blocks, and have 128Kbytes of memory Kb!)? available for sorting. What value should you recommend for maxsize (in b) [6 points] unattainable is unknown at Zombo.com!” They ask you how many The executives are unhappy with the idea of size limits. They remind you that “the disk I/Os it would take to sort a file that is one Yottabyte big (where Y = 1 Yottabyte). on a single server. Please state your answer as a function of the value Y 3 points for using a log of any kind. 1 point for log base 15. 1 point for the 2Y factor. 1 point for dividing by 8K and 16 appropriately. Name: ____________________________________ Class Account:______________________________

(1615 pages)8KB/page

= 1920 KB

2(Y/8K) * (1 + log 15 (Y/(8K*16)) I/Os

c) [8 points] that the output must be written on the same disk as the input, but that you have another disk available A frequent user of the service uploads 256KB files that are often already sorted. Assume to you for scratch space. Assume also that you choose to use QuickSort in memory. Fill in the following table describing the I/O behavior when they upload a file that happens to already be sorted: # of Random I/Os # of Sequential I/Os Pass 0 Read Pass 0 Write 1 (or 0)1 (or 0) 31 (or 32)31 (or 32) Pass 1 Read Pass 1 Write 3 (or 4) 1 29 (or 28) 31 Sequential Mostly: 4 points 32 Blocks/Row: 1 points Pass 1 Read: 2 points Pass 1 Write: 1 point

c) [5 points] The following SQL query is given: SELECT b.title, s.edition FROM book b, in_stock s WHERE b.isbn = s.isbn and s.lib_name = 'Cao Library' and NOT EXISTS (SELECT * FROM WHERE in_stock i.lib_name = i 'Evans Library' and i.quantity > s.quantity ); Which of the following queries will produce a different result set? i) SELECT b.title, s.edition FROM book b, in_stock s WHERE b.isbn = s.isbn and s.lib_name = 'Cao Library' s.quantity NOT IN (SELECT distinct quantity and FROM WHERE in_stock i.lib_name i = 'Evans Library'); ii) SELECT b.title, s.edition FROM book b, in_stock s WHERE b.isbn = s.isbn and s.lib_name = 'Cao Library' s.quantity > (SELECT MAX(quantity) and FROM in_stock WHERE i.lib_name = i 'Evans Library'); iii) (^) FROM book b, in_stock sSELECT b.title, s.edition WHERE b.isbn = s.isbn and s.lib_name = 'Cao Library' s.quantity > ALL (SELECT quantity and FROM in_stock WHERE i.lib_name = 'Evans Library'); i iv) None of the above. i : 5/5 i,ii,iii: 1.75/ i,ii or i,iii: 3.25/5 ii: 1.75/ iii: 1.75/ Your answer: i

d) [5 points] Given the following Relational Calculus statement:

{B | ∃B∊Books( ∃A1∊Authors(A1.birth판34r 5 16789 :

∃A7∊Authors( ( A7.birth판34r 5 1678 :

A7.birth판34r 5 A1.birth판34r B.4uthor ; A7.<4=399> :

circle the Relational Algebra expressions that compute the same result.

i) ρρ(A1,(A2( nameσbirthYear > 1920 2 , c, birthYear Author), bp), σ

birthYear > 1920 Author)

π isbn, title ,author (Books ?@author=name2 (σA 2 .birthYear > A 1 .birthYear(A1 × A2)))

ii) Books – ( (Books ?@σbirthYear < 1921 Author)

A (Books ?@σbirthYear < 1920 Author) )

iii) All the above

iv) None of the above

i: 5/ Your answer: i

b. [ sections, each of which is associated with a particular price. The management decides to introduce a 2 points] According to the above ER diagram, Zellerbach Hall is divided by default in three seating more flexible pricing policy, according to which the number of seating sections (and their pricing) may vary. What changes would you recommend to the above ER diagram to accommodate the management’s wish to make more money? Any solution capturing the need to create a separate entity set for the seating sections, which was associated with the Performance entity set via some relationship set, was given full score. c. (^) from the original ER diagram. Fill in the missing details, so that it captures the constraints that the ER [ 4 points] The following DDL SQL statement creates the table to store the “includes” relationship diagram represents. CREATE TABLE includes id INTEGER, ( name VARCHAR(20), composer VARCHAR(20) - PRIMARY FOREIGN KEY(id) REFERENCES Performance -1 KEY (id, name, composer ), - FOREIGN KEY(name, composer) REFERENCES Composition - );

Name_________________________________________

  • 7 -

4. Suppose the following sequence of calls is presented to the Buffer Manager of a database: Buffer Management and Spatial Indexing 1. 2. get(1);get(7); 3. 4. pin(7);get(3); 5. 6. pin(3);get(4); 7. 8. get(5);get(1); 9. 10. get(4);unpin(7); 11. 12. get(3);get(6); 13. 14. pin(6);get(2); 15. 16. get(1);get(1); 17. 18. unpin(3);get(2); 19. 20. get(6);get(2); 21. get(7); The calls above have the following behavior: - get(RID): fetches the record identified by RID from the buffer, potentially retrieving it from - disk as well if it is not already in the buffer.pin(RID): ensures that the record RID stays in the buffer. - unpin(RID): permits the record^ RID^ to be evicted from the buffer. Additionally, assume the following: - The Buffer Manager has 4 buffers A, B, C and D and they are all initially empty. - The calls containing pin(RID) RID. and unpin(RID) cause the Buffer Manager to access the buffer - If there are multiple free buffers for Buffer Manager to choose to assign to an Manager chooses the first free buffer alphabetically (for example, A before B if both are free). RID, Buffer For each of LRU, MRU and CLOCK, please indicate the misses that occur in the table provided below: final buffer contents and the number of buffer LRU Final buffer contents (RIDs) A B C D^ #^ of buffer misses (^27 3 )

MRU A B C D

CLOCK A B C D

1 or 7 7 or 1 3 6

1 point for each of the boxes A,B,C,D. 2 points for each correct no. buffer misses. 2 points free. Note that CLOCK has two possible correct entries for A,B depending upon how clock hand is moved.

Name_________________________________________

  • 8 -

a) [8 points] Fill in the following table: After processing html page: Contents of root index node Maximum number of keys that can be inserted without splitting any nodes ford.html fonefiftey

toyota.html expedition fonefiftey prius

honda.html fonefiftey

wanted.html fonefiftey

b) [4 points] the following queries? Assume a buffer size big enough to hold 1000 nodes After toyota.html is processed, how many disk blocks are accessed to answer each of, and that the buffer is empty at the start of each query. Keywords in query Number of nodes accessed fonefiftey 3 prius 3 a) [4 points] the following queries? Assume a buffer size big enough to hold 1000 nodes After wanted.html is processed, how many disk blocks are accessed to answer each of, and that the buffer is empty at the start of each query. Keywords in query Number of nodes accessed eclass 3

Name_________________________________________

  • 10 -

impala AND fonefiftey 5 2 points (all or nothing) for each correct root block. 1 point for max num keys insertable. 2 points for each keyword query. 2 points free

Name_________________________________________

  • 11 - toyota.html camry prius corolla tacoma wanted.html impala corolla zfour fonefiftey mustang