What is the Chubby Lock Service?
- The Chubby lock service provides coarse-grained locking as well as reliable (though low-volume) storage for a loosely coupled distributed system.
- The design emphasis is on availability and reliability, as opposed to high performance.
What is the purpose of the Chubby Lock Service?
- It is intended for use within a loosely coupled distributed system consisting of moderately large numbers of small machines connected by a high-speed network.
- The purpose of the lock service is to allow its clients to synchronize their activities and to agree on basic information about their environment.
- Chubby helps developers deal with coarse-grained synchronization within their systems, and in particular with the problem of electing a leader from among a set of otherwise equivalent servers.
- The Google File System uses a Chubby lock to appoint a GFS master server, and Bigtable uses Chubby in several ways: to elect a master, to allow the master to discover the servers it controls, and to permit clients to find the master. In addition, both GFS and Bigtable use Chubby as a well-known and available location to store a small amount of meta-data.
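The leader-election pattern described above can be sketched with a toy, in-memory stand-in for a lock service. This is only an illustration under stated assumptions: `ToyLockService`, `run_for_leader`, `find_leader`, and the lock path are all hypothetical names made up for this sketch, not the Chubby API.

```python
import threading
from typing import Dict, Optional

# Hypothetical path, made up for this sketch.
LEADER_PATH = "/ls/demo-cell/demo-service/leader"


class ToyLockService:
    """In-memory stand-in for a coarse-grained lock service (not Chubby)."""

    def __init__(self) -> None:
        self._mutex = threading.Lock()
        self._holders: Dict[str, str] = {}  # lock path -> holder id
        self._files: Dict[str, str] = {}    # file path -> small contents

    def try_acquire(self, path: str, client_id: str) -> bool:
        """Take the lock if it is free; never blocks."""
        with self._mutex:
            if path in self._holders:
                return False
            self._holders[path] = client_id
            return True

    def release(self, path: str, client_id: str) -> None:
        """Release the lock, but only if this client holds it."""
        with self._mutex:
            if self._holders.get(path) == client_id:
                del self._holders[path]

    def write(self, path: str, contents: str) -> None:
        """Store a small amount of metadata under the given path."""
        with self._mutex:
            self._files[path] = contents

    def read(self, path: str) -> Optional[str]:
        with self._mutex:
            return self._files.get(path)


def run_for_leader(service: ToyLockService, client_id: str) -> bool:
    """Whoever wins the lock is leader and advertises its identity."""
    if service.try_acquire(LEADER_PATH, client_id):
        service.write(LEADER_PATH, client_id)
        return True
    return False


def find_leader(service: ToyLockService) -> Optional[str]:
    """Clients locate the leader by reading the well-known file."""
    return service.read(LEADER_PATH)
```

For example, if `server-a` and `server-b` both call `run_for_leader`, only the first call succeeds, and any client can then call `find_leader` to learn who won. The lock is held for the leader's whole tenure rather than for a brief critical section, which is what makes it coarse-grained.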
Chubby Lock Service System Architecture

- Chubby has two main components that communicate via RPC: a server, and a library that client applications link against. All communication between Chubby clients and the servers is mediated by the client library.
- A Chubby cell consists of a small set of servers (typically five) known as replicas, placed so as to reduce the likelihood of correlated failure (for example, in different racks). The replicas use a distributed consensus protocol to elect a master; the master must obtain votes from a majority of the replicas, plus promises that those replicas will not elect a different master for an interval of a few seconds known as the master lease. The master lease is periodically renewed by the replicas provided the master continues to win a majority of the vote.
- The replicas maintain copies of a simple database, but only the master initiates reads and writes of this database. All other replicas simply copy updates from the master, sent using the consensus protocol.
- Clients find the master by sending master location requests to the replicas listed in the DNS. Non-master replicas respond to such requests by returning the identity of the master.
- Write requests are propagated via the consensus protocol to all replicas; such requests are acknowledged when the write has reached a majority of the replicas in the cell.
- Read requests are satisfied by the master alone; this is safe provided the master lease has not expired, as no other master can possibly exist. If a master fails, the other replicas run the election protocol when their master leases expire; a new master will typically be elected in a few seconds.
- If a replica fails and does not recover for a few hours, a simple replacement system selects a fresh machine from a free pool and starts the lock server binary on it. It then updates the DNS tables, replacing the IP address of the failed replica with that of the new one.
- The current master polls the DNS periodically and eventually notices the change. It then updates the list of the cell’s members in the cell’s database; this list is kept consistent across all the members via the normal replication protocol. In the meantime, the new replica obtains a recent copy of the database from a combination of backups stored on file servers and updates from active replicas. Once the new replica has processed a request that the current master is waiting to commit, the replica is permitted to vote in the elections for new master.
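The majority and lease rules above can be made concrete with a small simulation. This is a simplified sketch, not Chubby's consensus protocol: the `Replica` and `Master` classes and their methods are hypothetical, time is passed in explicitly so the behaviour is deterministic, and a real protocol would also handle updates that reach only a minority.

```python
from dataclasses import dataclass, field
from typing import List


def majority(cell_size: int) -> int:
    """Smallest number of replicas that forms a majority of the cell."""
    return cell_size // 2 + 1


@dataclass
class Replica:
    name: str
    up: bool = True
    log: List[str] = field(default_factory=list)

    def accept(self, update: str) -> bool:
        """Record the update if this replica is reachable."""
        if self.up:
            self.log.append(update)
        return self.up


class Master:
    def __init__(self, replicas: List[Replica], lease_seconds: float) -> None:
        self.replicas = replicas            # the whole cell, master included
        self.lease_seconds = lease_seconds  # "a few seconds" in the notes
        self.lease_expiry = float("-inf")   # no lease held yet

    def renew_lease(self, now: float, votes: int) -> bool:
        """Extend the master lease only with votes from a majority."""
        if votes >= majority(len(self.replicas)):
            self.lease_expiry = now + self.lease_seconds
            return True
        return False

    def write(self, update: str) -> bool:
        """Acknowledge a write once it has reached a majority of replicas."""
        acks = sum(1 for r in self.replicas if r.accept(update))
        return acks >= majority(len(self.replicas))

    def can_serve_read(self, now: float) -> bool:
        """Reads from the master alone are safe only while the lease holds."""
        return now < self.lease_expiry
```

With a five-replica cell, `majority(5)` is 3, so writes are still acknowledged with two replicas down but not with three down. Likewise, a master that cannot collect three votes cannot extend its lease, and once the lease expires it must stop answering reads on its own.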
Lessons Learnt while Designing Chubby
- Developers rarely consider availability. They rarely think about failure probabilities, and are inclined to treat a service like Chubby as though it were always available.
- Developers also fail to appreciate the difference between a service being up, and that service being available to their applications.
- Chubby provides an event that allows clients to detect when a master fail-over has taken place. Unfortunately, many developers chose to crash their applications on receiving this event, thus decreasing the availability of their systems substantially.
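One less drastic reaction to a fail-over event is to treat it as a hint that cached state may be stale and to re-read it, rather than exiting. The sketch below assumes a hypothetical event name (`MASTER_FAILED_OVER`) and a hypothetical `CachedConfig` helper; neither comes from the Chubby client library.

```python
from typing import Callable, Dict

# Hypothetical event name, made up for this sketch.
MASTER_FAILED_OVER = "master_failed_over"


class CachedConfig:
    """Keeps a local copy of small config data and refreshes it on fail-over."""

    def __init__(self, fetch: Callable[[], Dict[str, str]]) -> None:
        self._fetch = fetch   # function that re-reads the authoritative data
        self.data = fetch()   # initial snapshot
        self.rescans = 0      # how many times a fail-over forced a re-read

    def on_event(self, event: str) -> None:
        if event == MASTER_FAILED_OVER:
            # Re-read instead of crashing: the application stays up and
            # simply picks up whatever changed during the fail-over.
            self.data = self._fetch()
            self.rescans += 1
```

For example, an application could construct `CachedConfig(read_settings)` with its own `read_settings` function and route lock-service events to `on_event`; other event types are ignored by this handler.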
Chubby Lock Service Summary
- Chubby is a distributed lock service intended for coarse-grained synchronization of activities within Google’s distributed systems. It has found wider use as a name service and repository for configuration information.
- Chubby is based on distributed consensus among a few replicas for fault tolerance, consistent client-side caching to reduce server load while retaining simple semantics, timely notification of updates, and a familiar file system interface.
- Chubby has become Google’s primary internal name service; it is a common rendezvous mechanism for systems such as MapReduce; the storage systems GFS and Bigtable use Chubby to elect a primary from redundant replicas; and it is a standard repository for files that require high availability, such as access control lists.