COSC6117, Winter 2002
COSC6117
Theory of Distributed Computing
Winter 2002
Instructor: Eric Ruppert
Office: Computer Science Building, room 3042
Telephone: (416) 736-2100 ext. 33979
Facsimile: (416) 736-5872
Lectures: Mondays 14:30-16:00 in room 109 of the Farquharson Building
and Wednesdays 14:30-16:00 in room 129 of the Chem & Comp Sci Building
Office Hours: Tuesdays, Thursdays 14:30-15:30 or by appointment (or just drop by my office).
Announcements
For exercise 6, you can assume that processes are have unique,
(and consecutive) ids
1,2,3,...,n. (This is the assumption we have been
commonly making when designing shared-memory algorithms in class.)
The classes on Apr 8 and Apr 10 will be in CCB120.
Due to seminars, class begins at 3:00 on March 11, 18, 25, and at 2:30 on other days.
You can check my record of your exercise marks by typing "courseInfo 6117".
Course Description
Can a given problem be solved in a distributed system? If so, how
efficiently can it be solved? We investigate how the answers to these
questions depend on aspects of the underlying distributed system
including synchrony, fault-tolerance and the means of communication
between processes.
Possible topics include:
- shared-memory and message-passing models of distributed systems,
- I/O automata,
- mutual exclusion,
- agreement problems (consensus, leader-election, Byzantine agreement, approximate agreement),
- broadcast and multicast algorithms,
- impossibility results and lower bounds,
- the consensus hierarchy,
- implementing shared data structures,
- randomization in distributed computing,
- self-stabilization, and
- applications of topology to distributed computing.
Marking scheme
Homework exercises | 80% |
Class presentation | 20% |
Lecture Notes
One student (a "scribe") will take notes in each class and then type them in Latex. The following formatting files are useful for the scribe notes: notes.sty and template.tex.
The references below are intended for students who want to read more about the topics discussed in class. Sometimes the readings might be helpful for the assignments. Sometimes they will extend the ideas covered in lectures.
- Jan 7: Introduction. Impossibility of Two Generals.
- Jan 9:
[Postscript file][Latex file]
A mutual exclusion algorithm and overview of models of distributed computing. References [Dij65] and perhaps [LL90] or [Lyn96].
- Jan 14:
[Postscript file][Latex file]
Complexity measures, types of distributed problems. I/O Automata (briefly). Flooding in a MP system to do broadcasts and to set up spanning tree. References [Lyn96] for I/O automata; [AW98], Sections 2.1-2.3 for broadcasting stuff.
- Jan 16:
[Postscript file][Latex file]
Broadcasting (cont'd): setting up & using spanning trees and a lower bound time to do first broadcast. References [AW98], Sections 2.4, 2.5. See [AGPV90] for a more general version of the lower bound. More on this topic in [Lyn96] Sections 15.3-15.5.
- Jan 21:
[Postscript file][Latex file]
Leader Election in a ring. Reference [AW98], Sections 3.1-3.3.
- Jan 23:
[Postscript file][Latex file]
Minimum Spanning Tree algorithm. Reference [Lyn96], Section 4.4.
Consensus with Byzantine Failures. Reference [AW98, Sec 5.2.3].
- Jan 28:
[Postscript file][Latex file]
Synchronous Consensus with Byzantine Failures. Impossibility results [Lyn96, Sec 6.4,6.5] and algorithm [BG93 or AW98, Sec 5.2.5].
- Jan 30:
[Postscript file][Latex file]
Finishing up algorithm from last day. Algorithm and lower bound for
synchronous consensus with halting failures [AW98], Section 5.1, or [Lyn96], Section 6.7. (The original proof of the lower bound was in [DM90].)
- Feb 4:
[Postscript file]
Continuation of lower bound. (See previous line for references.)
- Feb 6:
[Postscript file]
Shared-memory systems. Linearizability. Implementations. References: [HW90], [AW98], Chapter 9 or [Lyn96], Chapters 9 and 13.1.
- Feb 18:
[Postscript file]
Impossibility of consensus
in (some) asynchronous systems with halting failures [FLP85] or [AW98], Section 5.3, or [Lyn96], Chapter 12.
- Feb 20:
[Postscript file]
The consensus hierarchy [Her91].
- Feb 25:
[Postscript file]
The consensus hierarchy, continued.
- Feb 27:
[Postscript file]
Universality of consensus [Her91] or [AW98], Section 15.3.
- Mar 4:
[Postscript file]
A bit more on consensus hierarchy: robustness. Intro to snapshot objects.
- Mar 6:
[Postscript file]
Snapshot objects [AADGMS93] or [Lyn96], Sections 13.3.1, 13.3.2.;
- Mar 11:
[Postscript file]
Building better registers [IL93], [AW98] Section 10.2.2.
- Mar 13:
[Postscript file]
Building better registers, continued.
- Mar 18: Randomized Leader Election [AW98], section 14.1.
Self-stabilization [Dij74].
- Mar 20: Self-stabilization, cont'd. Topology [HS99], [HR95].
- Mar 25: Topology, cont'd.
- Mar 27: Presentations by Gilles Pouliquen (Metamorphic Robot Chains(!)), Alex Tumanov (distributed virtual environments), Mikhail Fomitchev (space complexity of randomized consensus)
- Apr 1: Presentations by Ying Zou (deadlock handling), Yijun Du (Synchronization using quantum-based scheduling), Andriy Pavlovych (Counting Networks)
- Apr 3: Presentations by Junjie Guo (Lock-Free Linked Lists), Jianliang Chen (Randomized consensus), Sahib Aulakh (consistency conditions).
- Apr 8: Presentations by Qian Wan (Self-stabilizing leader election), Chenchen Xiao (LE with intermittent link failures), Qi Wang (Adaptive algorithms).
- Apr 10: Presentations by Zhihua Wen (Message-passing mutual exclusion), Caroline Wang (clock synchronization), Vladimir Blagojevic (failure detectors)
References
There is no required textbook for the course. However, I shall sometimes recommend readings from books or papers. These references will be listed here, and the list will grow during the term. Accessing some of the links below may require you to be logged into a machine at York, so that you can access the ACM Digital Library, etc.
Books (on reserve in Science Library)
Papers
- [AADGMS93] Yehuda Afek, Hagit Attiya, Danny Dolev, Eli Gafni, Michael Merritt and Nir Shavit. Atomic snapshots of shared memory. J. of the ACM, 40(4), pages 873-890, September 1993.
- [AGPV90] Baruch Awerbuch, Oded Goldreich, David Peleg and Ronen Vainish. A trade-off between information and communication in broadcast protocols. J. of the ACM, 37(2), pages 238-256, April 1990.
- [BG93] P Berman and J. Garay. Cloture votes: n/4-resilient distributed consensus in t+1 rounds. Mathematical Systems Theory 26(1), pages 3-19, 1993.
- [Dij65] E. W. Dijkstra. Solution of a problem in concurrent programming control . Communications of the ACM, 8(9), page 569, September, 1965. A very early paper on mutual exclusion.
- [Dij74] Edsger W. Dijkstra. Self-stabilizing systems in spite of distributed control. Communications of the ACM, 17(4), pages 643-644, November 1974.
- [DM90] Cynthia Dwork and Yoram Moses. Knowledge and common knowledge in a Byzantine environment: Crash failures. Information and Computation, 88(2), pages 156-186, October 1990.
- [FLP85] Michael J. Fischer, Nancy A. Lynch and Michael S. Paterson. Impossibility of distributed consensus with one faulty process. Journal of the ACM, 32(2), pages 374-382, April 1985.
- [Her91] Maurice Herlihy. Wait-free synchronization. ACM Transactions on Programming Languages and Systems, 13(1), pages 124-149, January 1991.
- [HR95] Maurice Herlihy and Sergio Rajsbaum. Algebraic topology and distributed computing: A primer. In Computer Science Today: Recent Trends and Developments, pages 203-217. Springer, 1995.
- [HS99] Maurice Herlihy and Nir Shavit. The topological structure of asynchronous computability. Journal of the ACM, 46(6), pages 858-923, November 1999.
- [HW90] Maurice P. Herlihy and Jeannette M. Wing. Linearizability: A correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems, 12(3), pages 463-492, July 1990.
- [IL93] Amos Israeli and Ming Li. Bounded time-stamps. Distributed Computing, 6(4), pages 205-209, July, 1993.
- [LL90] Leslie Lamport and Nancy Lynch. Distributed Computing: Models and methods. In Handbook of Theoretical Computer Science, Volume B, Chapter 18, Elsevier, 1990. (On reserve in Steacie Library.) A good, brief introduction to the area. It describes aspects of distributed models and surveys some important early results.
Web Pages
Exercises
I'll give you approximately one exercise per week. Each will be due two weeks after it's assigned. (The first one is a two-parter, since I didn't give any exercise in week 1.)

Updated March 26, 2002