CSE-4411A
Database Management Systems

York University
Fall 2010
Final Exam Preparation
  Layout

The exam will probably consist of seven questions. I am aiming to make it about one and a half times the size of either midterm. Given the exam duration is three hours, there ought not be time pressure. The type of questions and layout of the exam will be similar the midterms.

exercise
  • E.g., What is the I/O cost of this query plan?
    How many tuples is this query likely to return? (Use assumptions of Chapter 12 / System R to estimate.)
short answer
  • E.g., Which is the most apropriate join algorithm for this query?
analysis
  • E.g., How would we need to modify the 2-pass sort-merge join to accommodate...

 
  Coverage

The exam is cumulative. It covers all the topics we covered in the course. The coverage will be fairly uniform over these topics.

The topics we have really added since the send midterm have been a fuller coverage of Ch.15, the overall query optimizer, and an overview of transactions and transaction management from Ch.16.

  1. The Physical Database
    1. Storage, Memory Management, Data Structures, & Cost Model (Ch.8 & 9)
      1. disk space manager
      2. cost model: how to count I/O's
      3. buffer pool manager
      4. file and page organization
    2. Indexes
      1. types of indexes (Ch.8)
        1. clustered vs. unclustered
        2. alternatives 1, 2, &3 arrangements
        3. matching to search keys
        4. differences between tree and hash-based indexes for useage
      2. Tree Indexes (Ch.10)
        • B+ Trees (emphasis here)
      3. Hash-based Indexes (Ch.11)
        1. extendable
        2. linear
      4. indexes as access paths (Ch.8 & elsewhere)
        1. prefix matching
        2. cost model for using
        3. clustered vs unclustered
        4. index-only plans
        5. index intersection
    3. External Sorting (Ch.13)
  2. The Query Processor
    1. query evaluation, overview (Ch.12)
      1. access paths
      2. cost model & System R
        1. reduction factors
        2. cost estimation
      3. query trees & alternative plans
        1. pipelining
        2. optimization "rules" like pushing selects
    2. evaluating the relational operators (Ch.14)
      1. select & access paths
      2. project with distinct
      3. join
        1. nested loop joins
          • block-nested-loop join
          • index-nested-loop join
        2. sort-merge join
          • naïve: sort first, then merge
          • 2-pass: accomplishes join in two passes
        3. hash join (two pass)
      4. the set operations
      5. aggregation
    3. a relational query optimizer (Ch.15)
      1. translating SQL queries into relational-algebra trees
      2. estimating the cost of a plan
        1. estimating the result sizes
        2. estimating overall cost
      3. equivalences in the relational algebra
      4. enumerating alternative plans
      5. assumptions of System R
  3. Database Management (Ch.16)
    1. ACID properties
    2. serializability
    3. transaction mananagement
    4. two-phase locking
    5. crash recovery

    Only topics covered both in the reading and in class are fair game.

 
 
  Policies

The test will be open-note, open-book. You may bring a calculator, but the exam will be designed so that any math shouldn't need one.

There will be space on the exam packet for writing answers. I will bring extra paper, in case anyone needs it, to attach.

The exam is scheduled for three hours.

The usual rules and regulations of York University regarding final exams are in effect. Please bring identification (a picture ID) with you to the exam.

Parke Godfrey