Docsity
Docsity

Prepare for your exams
Prepare for your exams

Study with the several resources on Docsity


Earn points to download
Earn points to download

Earn points by helping other students or get them with a premium plan


Guidelines and tips
Guidelines and tips

CS 186 Midterm II: Database Systems - SQL Queries and Query Optimization, Exams of Introduction to Database Management Systems

The cs 186 midterm ii exam from the university of california, berkeley's computer science division - electrical engineering and computer sciences (eecs) department, focusing on introduction to database systems. The exam covers various sql queries, join cost calculations, and query optimization techniques.

Typology: Exams

2012/2013

Uploaded on 04/02/2013

shalin_p01ic
shalin_p01ic ๐Ÿ‡ฎ๐Ÿ‡ณ

4

(7)

86 documents

1 / 12

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
University of California, Berkeley
College of Engineering
Computer Science Division โ€“ EECS
Fall 2004 Prof. Michael J. Franklin
MIDTERM II
CS 186 Introduction to Database Systems
NAME: ___D.B. Guru_____________________ STUDENT ID:__________________
IMPORTANT:
Circle the last two letters of your class account:
cs186 a b c d e f g h i j k l m n o p q r s t u v w x y z
a b c d e f g h i j k l m n o p q r s t u v w x y z
DISCUSSION SECTION DAY & TIME:____________ TA NAME: ___________
This is a closed book examination โ€“ but you are allowed one 8.5โ€ x 11โ€ sheet of notes (double
sided). You should answer as many questions as possible. Partial credit will be given where
appropriate. There are 100 points in all. You should read all of the questions before starting the
exam, as some of the questions are substantially more time-consuming than others.
Write all of your answers directly on this paper. Be sure to clearly indicate your final answer
for each question. Also, be sure to state any assumptions that you are making in your answers.
GOOD LUCK!!!
Problem Possible Score
1. SQL 27
2. Join Cost Calculations 9
3. Query Optimization 24
4. ER Diagrams 25
5. Functional Dependencies 15
TOTAL 100
pf3
pf4
pf5
pf8
pf9
pfa

Partial preview of the text

Download CS 186 Midterm II: Database Systems - SQL Queries and Query Optimization and more Exams Introduction to Database Management Systems in PDF only on Docsity!

University of California, Berkeley College of Engineering Computer Science Division โ€“ EECS

Fall 2004 Prof. Michael J. Franklin

MIDTERM II CS 186 Introduction to Database Systems

NAME: ___D.B. Guru_____________________ STUDENT ID:__________________

IMPORTANT: Circle the last two letters of your class account:

cs186 a b c d e f g h i j k l m n o p q r s t u v w x y z a b c d e f g h i j k l m n o p q r s t u v w x y z

DISCUSSION SECTION DAY & TIME:____________ TA NAME: ___________

This is a closed book examination โ€“ but you are allowed one 8.5โ€ x 11โ€ sheet of notes (double sided). You should answer as many questions as possible. Partial credit will be given where appropriate. There are 100 points in all. You should read all of the questions before starting the exam, as some of the questions are substantially more time-consuming than others.

Write all of your answers directly on this paper. Be sure to clearly indicate your final answer for each question. Also, be sure to state any assumptions that you are making in your answers.

GOOD LUCK!!!

Problem Possible Score

  1. SQL 27
  2. Join Cost Calculations 9
  3. Query Optimization 24
  4. ER Diagrams 25
  5. Functional Dependencies 15

TOTAL 100

Question 1 โ€“SQL [6 parts, 27 points total]

For parts a-d, consider the following schema (primary keys are underlined):

Student (sname, sid, gpa, level, deptno) Course (cno, cname, deptno, units) Dept (dname, deptno) Takes (sid, cno)

a) [7 points] Write a SQL query that returns the names (i.e., snames) of students who have taken more courses outside their department than inside their department. For this question, you can assume that all students in the database have taken at least one course inside their department. (note: you should do scratch work elsewhere and just put your final answer here!)

b) [3 points] Which of the following queries returns the department numbers of those departments for which there are no courses being offered? More than one choice may be correct.

A) SELECT D.deptno FROM Dept D, Course C WHERE D.deptno NOT EQUAL C.deptno;

B) SELECT C.deptno, COUNT(C.deptno) FROM Course C GROUP BY C.deptno HAVING COUNT (C.Deptno) = NULL;

C) SELECT C.deptno FROM Course C WHERE C.deptno NOT IN (SELECT * FROM Dept);

D) SELECT D.deptno FROM Dept D WHERE NOT EXISTS (SELECT * FROM Course C WHERE C.deptno = D.deptno); E) None of the above

c) [3 points] Which of the following queries returns the id of the student with the highest GPA? More than one choice may be correct.

A) SELECT S.sid FROM Students S

e) [3 points] For the following schema: Athletes(name, country, sport, age, height, weight)

Which of the following SQL queries reflects the English query statement: "For each country, find the average height of weightlifters, qualifying only those countries that have weightlifters with minimum weight of 160 pounds." More than one choice may be correct.

A) SELECT country, avg(height) FROM Atheletes WHERE sport = "weightlifting" GROUP BY country, height, weight HAVING min(weight) >= 160;

B) SELECT country, avg(height) FROM Atheletes GROUP BY country, sport HAVING min(weight) >= 160 AND sport = "weightlifting";

C) SELECT country, avg(height) FROM Atheletes WHERE sport ="weightlifting" GROUP BY country HAVING min(weight) >= 160;

D) SELECT country, avg(height) FROM Atheletes WHERE sport = "weightlifting" AND min(weight) >= 160 GROUP BY country, weight;

E) SELECT country, avg(height) FROM Atheletes WHERE sport ="weightlifting" GROUP BY country, height HAVING min(weight) >= 160;

f) [8 points] For the schema in part (e), write a SQL query that returns for each sport, the name of the sport, the country that has the most athletes who play that sport, and the number of athletes of that country that play that sport. (note: you should do scratch work elsewhere and just put your final answer here!)

Question 2 โ€“ Join Costs [3parts, 9 points total]

For this question, you will consider the I/O cost of operations on two tables of a database. The database has the following schema (note, this schema is slightly different than the schema used in question 1).

Students(sid, name, address, GPA) EnrolledIn(sid, classid, semester, year)

Assume that tuples are of fixed size. There are 10 Students tuples per page and 200 EnrolledIn tuples per page. Also assume that there are 1000 pages in the Students relation, and 500 pages in the EnrolledIn relation. The data is unsorted, and tuples are distributed evenly throughout the database.

For the following join strategies, give the I/O cost. Assume 52 pages in the buffer, and that no pages are currently in the buffer when the join begins. If multiple variants of an algorithm have been discussed in class, section, or in the book, use the most efficient one unless otherwise noted. Be sure to state any assumptions you are making and be sure to clearly indicate your final answer.

a) [3 points] Hash Join (not hybrid):

b) [3 points] Sort Merge (note, both relations can be sorted in two passes):

c) [3 points] Block Nested Loops:

Question 3 โ€“ Query Optimization (continued)

b) [4 points] What is the expected cardinality of the Accident relation after initial selections are applied:

Next, estimate the I/O cost for the following access plans in Pass 1. Be sure to list any assumptions you are making (for example if you are sorting RIDs before accessing data).

c) [3points] Index scan on Car(company) for the relation Car:

d) [3 points] Index scan on Accident(accident_date) for the relation Accident:

e) [3 points] Index scan on Accident(damage_amount) for the relation Accident:

f) [3 points] List all join orders that will be considered in Pass 3 by the System R query optimizer. (ignore the specific join algorithm in this step).

g) [6 points] Suppose page nested loop join algorithm is the only available join algorithm, what is the best join order and what is the total estimated cost?

Question 4 โ€“ ER models (continued)

a) [10 points] Create a relational schema (with SQL CREATE TABLE statements) for this diagram. Be sure to label all primary and foreign key constraints. The types of the attributes are as follows:

INTEGER: GuitarType.numFrets, Guitar.serial#, Player.ssn CHAR(20): all others

Note: Your schema for this part should have no more than four tables (solutions with more than four tables will not receive full credit). NOTE: the solution for question 4 is a sample solution; some minor variations on this could still give you full credit.

Question 4 โ€“ ER models (continued)

b) [2 points] Say that we want to impose an additional constraint that a guitar can be played by at most one player. Indicate this new constraint on the original diagram for this question (Be sure that the change is clearly indicated, there will be no regarding on this part).

See arrow in dashed ellipse in diagram.

c) [4 points] For the change in part b, indicate (below) how your CREATE TABLE statements would have to be changed to reflect this.

d) [4 points] Now, say that we want to add the fact that some players are teachers that teach other players. Teachers can teach multiple players and players can have multiple teachers. Indicate this new information on the original diagram for this question (Be sure that the change is clearly indicated, there will be no regarding on this part).

e) [5 points] Write the CREATE TABLE statement(s) that capture this information. Note, this may cause you to change one or more of your original CREATE TABLE statements.