Project #4 | EECS-3421A: Introduction to Database Systems | fall 2015

EECS-3421A
Introduction to Database Systems
York University
Fall 2015

Project #4
The Apt APP:
Refund

Project

In this project, you will write an application program to update the York River Bookseller's Database (YRB DB) and to report overcharges to customers.

York River Booksellers promises in their advertising that customers always get the best price on a book for which they are eligible. (That is, the lowest club offer for that book such that the customer is a member of that club.) However, it has been discovered that customers are occassionally overcharged! For instance, 'Zebulon Zilio' bought 'Transmorgifacation' under club offer 'Basic' for $288.73. However, as a 'AAA' member, he was eligible for the price of $278.73.

York River Booksellers has promised to refund overcharges to its customers. This refund process involves two parts.

Find the differences between the too-high prices and correct prices for each customer for purchases in which they were overcharged. Print a report that prints a line for each customer and purchase deserving a refund stating the customer's name, title and year of the book in the purchase, the number of copies bought (qnty), the price paid, and the best price they should have paid instead.
Update the YRB DB to reflect that the refunds were made. Do this by updating each tuple in the table yrb_purchase which has an overcharge to reflect the lowest price for which the customer was eligible instead, by changing the purchase's club to a club (for which the customer is a member) that yeilds the best price for the customer. (Should there be a tie for "best" club for a given purchase, choose the first one in lexicographical ordering.)

Note that query refund given as an example in Project #3 implements part 1!

Your APP should implement part 2. Either write a JDBC java program to update the YRB database to reflect the refunds as defined above, or just write pure SQL code that accomplishes the same.

If you write a JDBC program for this, call it Refunder.java. Running the program should update and commit the changes to the database.

If instead you write pure SQL for this, put it in a file named refunder. Executing the file by DB2 should update the database. You may assume auto-commit. E.g.,

% db2 -tf refunder

The YRB Database

The database is the very same York River Bookseller's database from Project #3. The same two scripts are provided here for convenience:

yrb-create and
yrb-drop.

Notes

If you use JDBC for this task, you are likely to use the update statement in your code. (You would not have to, if you are going through an updatable cursor and doing things that way.) If you are using pure SQL for this task, you most surely are using the update statement

Quite unfortunately, the update statement of SQL (in DB2, at least) does not allow the use of the with clause. If you are like me, this can be somewhat traumatic. I use with often, and am somewhat addicted to it.

However, not all is lost. Remember, anything that can be done using the with clause can be done as well without it (with the exception of recursive queries). Instead, one can use sub-queries. It is just without with, one might have to repeat sub-queries. Also, highly nested queries can be harder to read and understand.

As an example, consider the task that we want to update the purchase table to replace whatever club offer the customer used with the first club (in alphabetical order) that the customer could have used instead. (That is, the customer is a member of that other club and the club has an offer available for the book.) Note that this is quite similar to the project! (Well, I am meaning to be helpful.) The difference is that the other club need not provide the best possible price. In fact, it could be providing a worse price. But, for this example, we do not care about that.

update yrb_purchase P
    set club = (
            select min(M.club)
                from yrb_member M, yrb_offer O
                where P.cid = M.cid
                    and M.club = O.club
                    and P.title = O.title
                    and P.year = O.year
        )
    where
        club <> (
            select min(M.club)
                from yrb_member M, yrb_offer O
                where P.cid = M.cid
                    and M.club = O.club
                    and P.title = O.title
                    and P.year = O.year
        );

The sub-queries in the set statement and the where clause are the same! However, we did not have any choice but to repeat ourselves. Okay, in this example, we could drop the where clause completely, and the result would be the same. After all, it does not matter if we update a row's club with the same value it already had. However, for other queries, it could matter.

Note also that the sub-queries are correlated. That is, they use range variables from the outer query. In this case, they use P, and thus are with respect to the current Purchase row under consideration. Typically in any non-trivial update that needs sub-queries, these sub-queries will be correlated.

There is another approach, which I think is more elegant. In an update statement, we are updating a given table. The table is specified right after the update keyword; so yrb_purchase above. In SQL, however, (in almost all cases) wherever we can put a table, we can put a sub-query. And that is true for update statements too!

For update, there are restrictions on what that sub-query can be. The manuals say that it must be a full-select statement. In essence, whatever sub-query we put there has to be updateable. It must be logically unambiguous which actual (base) table (and which attributes) need to be updated to accomplish the updates specified by the update statement's logic.

A full-select statement essentially has a single table in its from clause and it selects all that table's attributes. However, it can do some further, interesting things. It can have a where clause to select just particular tuples. It can use nested sub-queries. It can also add new attributes (columns), the values of which can be determined by sub-queries. (The full-select statement is not allowed to add additional rows with respect to its from table.)

update (select P.*,
               (select min(M.club)
                   from yrb_member M, yrb_offer O
                   where P.cid = M.cid
                       and M.club = O.club
                       and P.title = O.title
                       and P.year = O.year
               ) as first
           from yrb_purchase P
       )
    set club = first
    where club <> first;

This is more elegant than our previous solution. We did not have to repeat anything here! Note we are using a sub-query in the select clause for making another attribute (first). As such, the sub-query is only permitted to project a single column. Note that the sub-query is correlated; it refers to the current purchase tuple P being considered.

Yes, this is perfectly legal. It would result in a runtime error if the sub-query ever were to return more than one row. (Here, that is not possible.) It is fine if it returns one or zero rows. If one row is returned, the attribute's value of the row is used. If no rows are returned, a null value is used instead.

What if we wanted to move the where clause of the update to be the where clause of the update's sub-query instead? (In this case, the intended meaning would be the same, so we have no reason to. However, for other update statements, it could be needed.) Just moving it does not work! The parser will complain that first is out of scope in the where. And it is. We can only use range variables that appear in the from.

We can fix this, though. We just nest the sub-query in another query, so first is accessible. The result is still a full select query, so we are okay.

update (select Q.*
           from (select P.*,
                    (select min(M.club)
                        from yrb_member M, yrb_offer O
                        where P.cid = M.cid
                            and M.club = O.club
                            and P.title = O.title
                            and P.year = O.year
                    ) as first
                    from yrb_purchase P
                ) as Q    
           where club <> first
       )
    set club = first;

Deliverables

Due by 11:59pm Sunday 7 December 2015.

Turn in just electronically using PRISM's submit command. If you have done it by a JDBC Java program, call your program Refunder.java. Submit it by

% submit 3421 refunder Refunder.java

Otherwise, if you implemented in pure SQL, call the file containing the relevant SQL statements refunder. Then submit it by

% submit 3421 refunder refunder