Paper of the Week: The Chubby Lock Service for Loosely-Coupled Distributed Systems
This is part of our “Paper of the Week” series. For more info, check out our introductory blog post.
This week’s paper is The Chubby Lock Service for Loosely-Coupled Distributed Systems. It was written by Mike Burrows at Google and was presented at OSDI'06, USENIX’s Symposium on Operating Systems Design and Implementation.
This week’s paper was submitted by Hacker Schooler Leah Hanson, who said the following:
Chubby is a distributed lock service; it does a lot of the hard parts of building distributed systems and provides its users with a familiar interface (writing files, taking a lock, file permissions). The paper describes it, focusing on the API rather than the implementation details; it is written in a very readable style. There are amusing stories about other Google engineers (their users) using the API incorrectly; in response, they either fix the API or the implementation so that it’s no longer a problem.
Chubby provides locks and files. Distributed systems usually have one master (in a database, the master approves all writes before they’re real); when the master dies, a new machine needs to be elected master. Without Chubby, this is a hard problem and implementations are error prone. With Chubby (which depends on Paxos, the gold standard for this), a machine will try to grab a lock & write it’s name in a file when it wants to elect itself master; if it succeeds, it’s the new master and the other machines will believe it once they read it’s name in the file. Chubby provides an simple interface, but still gives you the correctness of using Paxos (which is very complex to use).
You have to work at Google if you want to see Chubby’s source code, but ZooKeeper is an opensource alternative.
Here’s the abstract from the paper:
We describe our experiences with the Chubby lock service, which is intended to provide coarse-grained locking as well as reliable (though low-volume) storage for a loosely-coupled distributed system. Chubby provides an interface much like a distributed file system with advisory locks, but the design emphasis is on availability and reliability, as opposed to high performance. Many instances of the service have been used for over a year, with several of them each handling a few tens of thousands of clients concurrently. The paper describes the initial design and expected use, compares it with actual use, and explains how the design had to be modified to accommodate the differences.
Read Along
Read Along is a way for you to participate in Paper of the Week. If you want to take part, all you have to do is read the paper, make something small in response (code or prose), and email us a link of what you make by noon Eastern Time next Monday.
Last week’s paper was The Power of Interoperability: Why Objects Are Inevitable. Here are the Read Along submissions:
- Igor Bondarenko wrote a blog post with a nice summary of the paper’s main points.
- Brian Cobb wrote a blog post with a reflection about frameworks and some good related links.
Happy reading!