|
CSE-3421
Introduction to Database Systems
York University
Winter 2009
|
Project #3:
SQL Jeopardy
Querying the York River Bookseller Database
|
|
|
|
In
this project,
you will work with an existing database,
the York River Bookseller's Database (YRB DB).
Actually,
each will install his or her own private copy of the database
on which to work.
For SQL Jeopardy,
you are to implement a number of SQL queries over the YRB DB.
The project is 10% of the total grade.
There are ten queries that you are to write.
Each is worth one point for a total of ten points for Project #3.
I am the manager and I provide you with English specifications
for the queries that I want answered from the database.
You are providing SQL implementations of these queries for YRB DB
to do the job.
Each query is given a name for bookkeeping.
Also, the answer table for each is provided!
You have successfully implemented the SQL query
if you get the same result when run in DB2 on YRB DB.
(And, of course, you have not cheated by making a simple query on purpose
that manages to print the same results!)
Otherwise, your query is unsuccessful.
Grading per query is all-or-nothing.
Either your query produces the right results, or it does not.
The queries below are arranged roughly in order of increasing difficulty.
The ones at the beginning are straightforward,
and the ones towards the end are difficult.
Do not get too discouraged if you are unable to do the final few.
Honesty:
For projects, you are permitted to confer with others,
seek advice, and (to a reasonable extent) help.
However, remember that copying someone else's queries
and claiming them as your own work is plagiarism.
You must do your own work.
|
|
|
|
Two
scripts are provided:
The script yrb-create will create the YRB DB schema for you.
It will also populate the tables with mock data.
The script yrb-drop is provided for convenience.
It will drop your copy of YRB DB from your DB2 schema space.
If you mess things up,
you can always drop YRB DB and then re-create it easily.
To create the YRB DB in your DB2 schema space:
Read the schema definition in yrb-create for YRB DB
to understand the design and what the YRB DB is about.
York River Booksellers is an online bookstore.
Customers belong to various clubs.
Everybody belongs to at least the club basic.
Books are available via different offers per club.
Thus the price of a book is determined by which offer (thus club)
it was bought under.
Consider an order to be all the books
a customer bought at the same time (when in purchase).
These will be shipped together.
|
|
|
|
1. ancient
Who are the customers who made a purchase before January 1 2002
('2002-1-1')?
Show customer's name and city and the date (not
timestamp!) of purchase. Eliminate duplicates.
Order by name + city + date
Answer table for ancient.
Hint:
The attribute when is of type timestamp.
This is different than type date.
So to use it say in a WHERE condition
to compare it with, say, '2002-1-1'.
you will need to cast it to type date:
cast(when as date).
Likewise, '2002-1-1' is a string!
So for DB2 to consider it as a date,
you must cast it as well:
cast('2002-1-1' as date).
|
2. multiple
Which customers have bought several copies of a book within a purchase?
Show the customer's name, the book's title and year, and how
many copies were purchased.
Order by name + title + year.
Answer table for multiple.
|
3. like
Which books have 'like' or 'Like' in the title?
Show the tile, year, and the book's category.
Order by title + year + cat.
Answer table for like.
Hint:
See the SQL operator like in the textbook
(pp 140-141, 3rd ed.).
|
4. topbuyer
For each club,
who is the customer who has spent the most via that club's offers
on books,
and what is the total that he or she has spent via that club?
Be certain to account for ties.
Two or more customers could tie as top buyer for a given club.
In such cases, report all of the top buyers.
Order by total sales, from highest to lowest, then by customer name.
Answer table for topbuyer.
|
5. pairs
Which pairs of customers have bought at least three books in common?
Print three columns: two with the customers' names and
one with the number of books in common.
Do not return any duplicates.
Furthermore, say 'Mark Dogfurry' and 'Zebulon
Zilio' have four books in common, only output ('Mark Dogfurry',
'Zebulon Zilio', 4) and not ('Zebulon Zilio', 'Mark Dogfurry', 4)!
If 'Mark Dogfurry' and 'Zebulon Zilio' have each bought the same
book three times, this does not count. It has to be at least three
different books.
Order by the names.
For each tuple, be certain that name first (column one)
is alphabetically before name second (column two).
Answer table for pairs.
|
6. allbooks
Which customers have bought all the books offered within
some category / language group, given that category / language group
contains more than one book title?
E.g., Who has bought all the French romance books?
List by customer's name and the Category and language.
Order by name + category + language.
Answer table for allbooks.
|
7. orders
All the books a customer orders at the same time (when) are considered
to be part of the same "order".
Those books are shipped together to the customer and the customer is billed
for the entire order.
What is the bill for each order?
Show the customer's name and city,
the date and time of the order (not the timestamp!), and the bill.
Order by name + city + when.
Answer table for orders.
Hint:
Want to shorten one of your decimal column types so your lines do not
wrap on output?
You can cast it to be smaller.
For instance,
- cast(col_a as decimal(5,2))
This will print decimals (numbers) with 5 numerals to the left of the
"." and 2 to the right.
|
8. weights
What is the total weight of each order?
Order by weight, descending,
and then by customers' names (in cases of ties on weight).
Answer table for weights.
|
9. billing
As in Query orders,
all the books a customer orders at the same time (when) are considered
to be part of the same "order".
Those books are shipped together to the customer and the customer is billed
for the entire order.
What is the bill for each order, with the shipping cost added?
The shipping cost is as follows: The weight of the order is looked up
in the 'yrb_shipping' table. If the weight is X grams, the entry just
higher than (or equal to) X is found in the shipping table and the
associated shipping price is added. For instance, if the order's weight
is 1447 grams, the entry '1500 5.00' is found, and so the cost is $5.00.
Show the customer's name and city, the date and time of the order (not
the timestamp!), the bill without the shipping charge, and the total
bill (with shipping).
Order by name + city + when.
Answer table for billing.
Hint:
This query is an extension of query orders;
it added another column that shows the total price, with shipping.
Also note that you can reuse your query weights
to help figure out the shipping cost for each purchase.
You may find DB2 SQL's
with
clause useful for this query.
(You probably would find with useful
for query percentage too and perhaps others.)
It lets you declare "temporary tables" for use within a query.
So you can store the results of one query for use in another.
See below on this page for an example.
|
10. droppable
A club is droppable
if all the purchases that are in the database
could still have been made by the customers,
just using the remaining clubs instead.
Note that each "reassigned" club per purchase must be a club
for which the purchase's customer is a member.
If there are no other legitimate clubs for reassignment
for some purchase,
the club in question is not droppable.
For each droppable club,
how much more money (or less!)
would YRB have made if that club had never existed?
Assume that a club
"reassignment" for each purchase involving the dropped club replaces it
with a best offer (across the remaining, legitimate clubs) for that customer.
Order by club.
Answer table for droppable.
|
|
|
|
Due by 11:59pm Friday 22 May 2009.
For
each query,
write your query in a file with the corresponding name as above
(ancient,
multiple,
like,
topbuyer,
pairs,
allbooks,
orders,
weights,
billing,
and
droppable).
Do not include the answers generated by each of your queries.
We will be testing your queries ourselves for grading your project.
When you are finished,
use the submit command to turn in your work.
We shall publish soon the submit script that you are to use.
The command will look something like:
- % submit 3421 yrb ancient multiple like topbuyer
pairs allbooks orders weights billing droppable
|
|
|
|
DB2
SQL provides a useful clause called with.
It is easiest to show its use by example.
|
refund
Who has bought a book via some club
that he or she could have gotten less expensively
using another club to which he or she belongs?
List the customer's name, the book's title and year,
the quantity bought, the price for which it was bought,
and the best price for which it could have been bought.
with
best (cid, title, year, lowest) as (
select distinct M.cid, O.title, O.year, min(price)
from yrb_member M, yrb_purchase P, yrb_offer O
where M.club = O.club and
M.cid = P.cid and
P.title = O.title and P.year = O.year
group by M.cid, O.title, O.year
)
select C.name, P.title, P.year, qnty, price, lowest
from yrb_customer C, yrb_purchase P, Best B, yrb_offer O
where P.cid = B.cid and P.title = B.title and
P.year = B.year and P.title = O.title and
P.year = O.year and P.club = O.club and
C.cid = P.cid and
O.price > B.lowest
order by C.name, P.title, P.year;
|
Answer table for query refund.
|
One may have more tables in the with clause, if need be:
with
first (...) as
(...),
second (...) as
(...),
third (...) as
(...)
select ...
...;
|
Note that you can use the (temporary) view first
inside the definition for second,
and the views first and second
inside the definition for third,
and so forth.
DB2 is picky about nested SELECTS always being named
via AS.
For example,
select ...
from (select ...) as A,
(select ...) as B,
...
...;
|
Even if you do not need the alias names A or B
later on, give the sub-queries (nested SELECTS) names anyway.
Otherwise, DB2 will complain.
|
|
|
|
|
|