Distributed Systems

Google File System


What is Google File System?

  • Google File System is a scalable distributed file system for large distributed data-intensive applications.
  • It is widely deployed within Google as the storage platform for the generation and processing of data.
  • The largest cluster of GFS provides hundreds of terabytes of storage across thousands of disks on over a thousand machines, and it is concurrently accessed by hundreds of clients.

Assumptions while building Google File System

  • Component failures are the norm rather than the exception. The file system consists of hundreds or even thousands of storage machines built from inexpensive commodity parts and is accessed by a comparable number of client machines. The quantity and quality of the components virtually guarantee that some are not functional at any given time and some will not recover from their current failures.
  • Constant monitoring, error detection, fault tolerance, and automatic recovery must be integral to the system.
  • Files are huge by traditional standards. Multi-GB files are common, and each file typically contains many application objects such as web documents. When working with fast-growing data sets of many TBs comprising billions of objects, it is unwieldy to manage billions of approximately KB-sized files even when the file system could support them.
  • Design assumptions and parameters such as I/O operation size and block size have to be revisited.
  • Most files are mutated by appending new data rather than overwriting existing data. Random writes within a file are practically non-existent.
  • Appending becomes the focus of performance optimization and atomicity guarantees, while caching data blocks in the client loses its appeal.
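The append-dominated workload described above can be sketched as a minimal append-only record log. This is an illustrative sketch, not GFS code: the point is that writers only add records at the end (without choosing an offset themselves), and readers later scan the file sequentially.

```python
# Illustrative sketch of an append-dominated access pattern
# (hypothetical, not the GFS implementation).

class AppendOnlyLog:
    def __init__(self):
        self.records = []

    def append(self, record: bytes) -> int:
        """Append a record and return the position it landed at.
        The writer does not pick an offset -- the log does, which is
        what makes concurrent appends easy to make atomic."""
        self.records.append(record)
        return len(self.records) - 1

    def scan(self):
        """Sequential read of the whole log, the common read pattern."""
        yield from self.records
```

Because no writer ever specifies an offset, two concurrent appenders can never overwrite each other's data, which is why GFS can offer an atomic record-append operation while barely supporting random writes.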

Google File System Architecture

  • A GFS cluster consists of a single master and multiple chunkservers and is accessed by multiple clients, as shown in Figure. Each of these is typically a commodity Linux machine running a user-level server process.
  • Files are divided into fixed-size chunks. Each chunk is identified by an immutable and globally unique 64-bit chunk handle assigned by the master at the time of chunk creation. Chunkservers store chunks on local disks as Linux files and read or write chunk data specified by a chunk handle and byte range. For reliability, each chunk is replicated on multiple chunkservers.
  • The master maintains all file system metadata. This includes the namespace, access control information, the mapping from files to chunks, and the current locations of chunks.
  • The master periodically communicates with each chunkserver in HeartBeat messages to give it instructions and collect its state.
  • GFS client code linked into each application implements the file system API and communicates with the master and chunkservers to read or write data on behalf of the application. Clients interact with the master for metadata operations, but all data-bearing communication goes directly to the chunkservers.
  • Neither the client nor the chunkserver caches file data. Client caches offer little benefit because most applications stream through huge files or have working sets too large to be cached. Not having them simplifies the client and the overall system by eliminating cache coherence issues. (Clients do cache metadata, however.)
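The read path described above can be sketched as follows. All names (`Master`, `lookup`, the in-memory dictionaries) are illustrative assumptions, not the real GFS API; what the sketch shows is the division of labor: the client computes the chunk index from the byte offset, asks the master only for metadata, and then fetches the bytes directly from a chunkserver replica.

```python
# Hypothetical sketch of the GFS client read path; names and data
# structures are illustrative, not the actual GFS interfaces.

CHUNK_SIZE = 64 * 1024 * 1024  # GFS uses fixed 64 MB chunks

class Master:
    """Holds metadata only: file -> chunk handles -> replica locations."""
    def __init__(self):
        self.file_chunks = {}      # filename -> [chunk handle, ...]
        self.chunk_locations = {}  # chunk handle -> [chunkserver addr, ...]

    def lookup(self, filename, chunk_index):
        handle = self.file_chunks[filename][chunk_index]
        return handle, self.chunk_locations[handle]

def read(master, chunkservers, filename, offset, length):
    """Translate (file, offset) into (chunk handle, byte range), then
    fetch the data directly from a chunkserver replica."""
    chunk_index = offset // CHUNK_SIZE                 # computed client-side
    handle, replicas = master.lookup(filename, chunk_index)  # metadata only
    chunk_offset = offset % CHUNK_SIZE
    # Data flows from a chunkserver, never through the master.
    data = chunkservers[replicas[0]][handle]
    return data[chunk_offset:chunk_offset + length]
```

Note that the master never sees file data; it answers one small metadata query, and the (cacheable) chunk handle and locations let the client talk to chunkservers directly for subsequent reads of the same chunk.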

What does Google File System Achieve?

  • The Google File System demonstrates the qualities essential for supporting large-scale data processing workloads on commodity hardware.
  • GFS treats component failures as the norm rather than the exception, optimizes for huge files that are mostly appended to (perhaps concurrently) and then read (usually sequentially), and both extends and relaxes the standard file system interface to improve the overall system.
  • GFS provides fault tolerance through constant monitoring, replication of crucial data, and fast, automatic recovery. Chunk replication allows the system to tolerate chunkserver failures. The frequency of these failures motivated a novel online repair mechanism that regularly and transparently repairs the damage and compensates for lost replicas as soon as possible. Additionally, checksumming is used to detect data corruption at the disk or IDE subsystem level, which becomes all too common given the number of disks in the system.
  • The GFS design delivers high aggregate throughput to many concurrent readers and writers performing a variety of tasks. It achieves this by separating file system control, which passes through the master, from data transfer, which passes directly between chunkservers and clients. Master involvement in common operations is minimized by a large chunk size and by chunk leases, which delegate authority to primary replicas for data mutations. This makes possible a simple, centralized master that does not become a bottleneck.
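The checksumming mentioned above can be sketched as a per-block checksum kept alongside each chunk and verified on every read, so corrupted data is detected rather than silently served. The 64 KB block size matches the GFS paper, but the use of CRC32 here and the function names are illustrative assumptions, not the real chunkserver internals.

```python
# Hedged sketch of chunk-level checksumming for corruption detection;
# CRC32 and these helpers are illustrative, not the GFS implementation.
import zlib

BLOCK = 64 * 1024  # GFS checksums data in 64 KB blocks

def checksum_blocks(data: bytes):
    """Compute one CRC32 per fixed-size block of a chunk; the
    chunkserver stores this list alongside the chunk."""
    return [zlib.crc32(data[i:i + BLOCK]) for i in range(0, len(data), BLOCK)]

def verify(data: bytes, checksums) -> bool:
    """Recompute and compare before returning data to a reader; a
    mismatch means the replica is corrupt and must be re-fetched
    from another chunkserver."""
    return checksum_blocks(data) == checksums
```

Verifying per block rather than per chunk means a read of a small byte range only has to checksum the blocks it touches, which keeps the cost of corruption detection proportional to the read size.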




