COSC6117, Winter 2006
COSC6117
Theory of Distributed Computing
Winter 2006
Instructor: Eric Ruppert
Office: Computer Science Building, room 3042
Email: [my last name] @cs.yorku.ca
Telephone: (416) 736-2100 ext. 33979
Facsimile: (416) 736-5872
Lectures: MW 11:30-13:00 in room 125 of the South Ross Building
Office Hours: Mondays and Fridays, 15:30-16:30, or by appointment, or just drop by when I'm around.
The best way to contact me is probably by email. Please use your cs account when sending me email, and start your subject line with "[6117]". Send messages in plain text, without attachments.
Announcements
- March 31: Hint for 9(a): Imagine an execution where nobody fails and there are two groups of f processes and all the messages that go from one group to the other are REALLY slow.
- February 24: Course evaluations will be done during class on Mar 20.
- February 9: I will be away and probably not checking email during reading week, Feb 13-17.
- January 20: Clarification for exercise 1. As in class, each general can be "programmed" differently. Note that the algorithm can depend on the graph. (This is equivalent to assuming that generals know the graph ahead of time.)
- January 6: Follow this link for a memo from Professor van Wijngaarden to all science and engineering students about the search for a new Dean of the faculty.
Course Description
Can a given problem be solved in a distributed system? If so, how
efficiently can it be solved? We investigate how the answers to these
questions depend on aspects of the underlying distributed system
including synchrony, fault-tolerance and the means of communication
between processes.
A tentative list of topics:
- shared-memory and message-passing models of distributed systems,
- mutual exclusion,
- agreement problems (consensus, leader-election, Byzantine agreement, approximate agreement),
- broadcast and multicast algorithms,
- impossibility results and lower bounds,
- the consensus hierarchy,
- implementing shared data structures,
- randomization in distributed computing,
- self-stabilization, and
- applications of topology to distributed computing.
Marking scheme
| Homework exercises | 80% |
| Class presentation | 20% |
The presentation will be a 25-minute talk summarizing the results of a
research paper on the theory of distributed computing that you find in
the literature. Before you start working on this, you should check
with me that the paper you choose is appropriate. You will also have
to hand in a short (~2 pages) written
summary of what you are presenting. Good places to look for
a paper include
PODC
or
DISC
conference proceedings or the journal
Distributed Computing.
This survey paper has a bibliography containing lots of papers that would be suitable to choose.
If you have a topic in mind, I might be able to help you find a
good paper if you come talk to me.
Lectures
These will be filled in as the term progresses.
The references below are intended for students who want to read more about the topics discussed in class. Sometimes the readings might be helpful for the assignments. Sometimes they will extend the ideas covered in lectures.
Instructions for typing up notes.
Download template.tex and
notes.sty. When you are finished, send me the .tex file (plus any figures). I'll edit and post them.
- Jan 4: Introduction. Two Generals.
- Jan 9: Broadcast. Reference: [AW04] sections 2.1-2.3, [APGV90].
- Jan 11: Broadcast and a bit about leader election [AW04] sections 2.4-2.5, [Lyn96] section 4.4.
- Jan 16: Distributed minimum spanning tree algorithm (synchronous version only) [GHS83], [Lyn96] section 4.4. Impossibility of leader election in a ring with no ids [Ang80] or [AW04] section 3.2.
- Jan 18: More on leader election in a ring. [AW04] section 3.3.
- Jan 23: Leader election: lower bound and randomized algorithms. [AW04] sections 3.3, 14.1
- Jan 25: Modelling systems using I/O Automata. [Lyn96] chapter 8.
- Jan 30: Shared-memory systems. Linearizability [HW90]. Consensus problem.
- Feb 1: Wait-free consensus: 2-process algorithm using stack, n-proc alg using compare&swap, impossibility using registers. [FLP85,Her91]
- Feb 6: 3-process consensus cannot be solved using stacks and registers.
Consensus hierarchy. Implementing counter from registers. [Her91]
- Feb 8: Snapshots. [AADGMS93]
- Feb 20: Herlihy's universal construction [Her91].
- Feb 22: Finished universal construction. Byzantine consensus [AW04, Sec 5.2] or [Lyn96, Sec 6.4,6.5].
- Feb 27: Byzantine consensus.
- Mar 1: Synchronous consensus with crash failures: algorithm and time lower bound. [AW04, Sec 5.1]
- Mar 6: Finishing time lower bound for synchronous consensus.
Specification of chat room.
- Mar 8: Clock synchronization in complete graph, no failures [AW04, Sec 6.3.2]
- Mar 13: Clock synchronization [HMM85,BW01]
- Mar 15: Clock synchronization. Renaming [MA94]
- Mar 20: Renaming. [AW04, Section 16.3], [MA94]
- Mar 22: Connection between message-passing systems and shared-memory. [AW04, Sec 10.4]
- Mar 27: Presentations: Jing Yang (Robust wait-free hierarchies), Niloufar Shafiei (Obstruction-free synchronization: Double-ended queues as an example).
- Mar 29: Presentations:
Stuart Maclean (MST construction in O(log log n) communication rounds, Fatima Ramay (Fast Randomized Consensus using Shared Memory)
- Apr 3: Presentations: Babita Sharma (Self-stabilizing systems in spite of distributed control), Yin Yan (Bounded time-stamps) Jaisingh Solanki (Relative Liveness and Behavior Abstraction).
References
There is no required textbook for the course. However, I shall sometimes recommend readings from books or papers. These references will be listed here, and the list will grow during the term. Accessing some of the links below may require you to be logged into a machine at York, so that you can access the ACM Digital Library, etc.
Books (on reserve in Science Library)
Papers
This list is from last year's version of the course, but it gives you an idea of the kinds of topics covered. I'll add and subtract items from the list during the term.
- [AADGMS93] Yehuda Afek, Hagit Attiya, Danny Dolev, Eli Gafni, Michael Merritt and Nir Shavit. Atomic snapshots of shared memory. J. of the ACM, 40(4), pages 873-890, September 1993.
- [Ang80] D. Angluin. Local and global properties in networks of processors. In Proc. 12th ACM Symposium on Theory of Computing, pages 82-93, 1980.
- [ABDPR90] Hagit Attiya, Amotz Bar-Noy, Danny Dolev, David Peleg and Rüdiger Reischuk. Renaming in an asynchronous environment. J. of the ACM, 37(3), pages 524-548, July 1990.
- [AGPV90] Baruch Awerbuch, Oded Goldreich, David Peleg and Ronen Vainish. A trade-off between information and communication in broadcast protocols. J. of the ACM, 37(2), pages 238-256, April 1990.
- [BG93] P. Berman and J. Garay. Cloture votes: n/4-resilient distributed consensus in t+1 rounds. Mathematical Systems Theory 26(1), pages 3-19, 1993.
- [BW01] Saâd Biaz and Jennifer L. Welch. Closed form bounds for clock synchronization under simple uncertainty assumptions. Information Processing Letters, 80, pages 151-157, 2001.
- [Dij65] E. W. Dijkstra. Solution of a problem in concurrent programming control . Communications of the ACM, 8(9), page 569, September, 1965. A very early paper on mutual exclusion.
- [Dij74] Edsger W. Dijkstra. Self-stabilizing systems in spite of distributed control. Communications of the ACM, 17(4), pages 643-644, November 1974.
- [DM90] Cynthia Dwork and Yoram Moses. Knowledge and common knowledge in a Byzantine environment: Crash failures. Information and Computation, 88(2), pages 156-186, October 1990.
- [FR03] Faith Fich and Eric Ruppert. Hundreds of impossibility results for distributed computing. In Distributed Computing, 16(2-3), pages 121-163, 2003.
- [FLP85] Michael J. Fischer, Nancy A. Lynch and Michael S. Paterson. Impossibility of distributed consensus with one faulty process. Journal of the ACM, 32(2), pages 374-382, April 1985.
- [GHS83] R. G. Gallager, P. A. Humblet and P. M. Spira. A distributed algorithm for minimum-weight spanning trees. ACM Transactions on Programming Languages and System Sciences, 5(1), pages 66-77, 1983.
- [HMM85] Joseph Y. Halpern, Nimrod Megiddo and Ashfaq A. Munshi. Optimal precision in the presence of uncertainty. Journal of Complexity, 1, pages 170-196, 1985.
- [Her91] Maurice Herlihy. Wait-free synchronization. ACM Transactions on Programming Languages and Systems, 13(1), pages 124-149, January 1991.
- [HR95] Maurice Herlihy and Sergio Rajsbaum. Algebraic topology and distributed computing: A primer. In Computer Science Today: Recent Trends and Developments, pages 203-217. Springer, 1995.
- [HS99] Maurice Herlihy and Nir Shavit. The topological structure of asynchronous computability. Journal of the ACM, 46(6), pages 858-923, November 1999.
- [HW90] Maurice P. Herlihy and Jeannette M. Wing. Linearizability: A correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems, 12(3), pages 463-492, July 1990.
- [Lam04] Leslie Lamport. A new solution of Dijkstra's concurrent programming problem. Communications of the ACM, 18(8), pages 453-455, August 1974.
- [LL90] Leslie Lamport and Nancy Lynch. Distributed Computing: Models and methods. In Handbook of Theoretical Computer Science, Volume B, Chapter 18, Elsevier, 1990. (On reserve in Steacie Library.) A good, brief introduction to the area. It describes aspects of distributed models and surveys some important early results.
- [LM85] Leslie Lamport and P. M. Melliar-Smith. Synchronizing clocks in the presence of faults. J. of the ACM, 32(1), pages 52-78, 1985.
- [MA94] Mark Moir and James H. Anderson. Fast, Long-Lived Renaming. In Proc. of the 8th International Workshop on Distributed Algorithms, pages 141-155, 1994. (There have been several better versions of adaptive renaming algorithms published afterwards, but I just want to cover the most basic version of the splitter algorithm. See Mark Moir's page for some of the improvements.)
- [Tau04] Gadi Taubenfeld. The black-white bakery algorithm and related bounded-space, adaptive, local-spinning and FIFO algorithms. In Distributed Computing, 18th International Conference, pages 56-70, 2004.
Web Pages
Previous versions of this course: Winter 2002, Fall 2003, Fall 2004.
Exercises

Updated March 31, 2006