Monday, September 19, 2011

Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services

The idea behind CAP is simple, and the proof is straightforward to understand. I think the important insight, given by Brewer himself, is that BASE and ACID are the two ends of a spectrum. Different services fall at different points on this spectrum. For example, money transactions should ideally be ACID, while most other Internet-based services can tolerate temporary inconsistency given a clever design. It would be nice to design an infrastructure that lets developers tune, through parameters, the amount of inconsistency they can tolerate.
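One well-known way to expose such a knob is quorum replication, as used in Dynamo-style stores. This is not from the paper itself. The sketch below is a minimal illustration under that assumption, and the names (QuorumConfig, n, r, w) are hypothetical. Each item lives on n replicas, a read waits for r of them, and a write waits for w of them. When r + w > n, every read quorum overlaps every write quorum, so a read sees the latest acknowledged write. Smaller r and w trade that guarantee for availability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumConfig:
    """Hypothetical tuning knobs for a replicated store (illustration only)."""
    n: int  # replicas holding each item
    r: int  # replicas that must answer a read
    w: int  # replicas that must acknowledge a write

    def __post_init__(self):
        if not (1 <= self.r <= self.n and 1 <= self.w <= self.n):
            raise ValueError("r and w must be between 1 and n")

    def reads_see_latest_write(self) -> bool:
        # Read and write quorums share at least one replica when r + w > n.
        return self.r + self.w > self.n

    def write_failures_tolerated(self) -> int:
        # Writes still succeed while at least w replicas are reachable.
        return self.n - self.w

    def read_failures_tolerated(self) -> int:
        # Reads still succeed while at least r replicas are reachable.
        return self.n - self.r


strict = QuorumConfig(n=3, r=2, w=2)   # closer to the ACID end
relaxed = QuorumConfig(n=3, r=1, w=1)  # closer to the BASE end
```

With these settings, strict keeps reads consistent but tolerates only one unreachable replica per operation, while relaxed stays available with two replicas down at the cost of possibly stale reads.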


Cluster-Based Scalable Network Services

This paper gives an overview of one architecture for cluster-based network services. The paper is dated, but many of its ideas persist today:
1. Data semantics (BASE vs. ACID): many network services are willing to sacrifice consistency temporarily for higher availability, and for most of them it is fine to give an approximate answer to some queries.
2. Scalability: replicate components, or show that the non-replicable components are not bottlenecks.
3. Soft state: this is yet another way to improve availability. Soft state is not persisted in a data store; it is computed from peer communication, so it can be rebuilt after a failure. This is still popular today in systems like Amazon's Dynamo.
4. Layered architecture that helps developers focus on the "content" of network services: TACC (transformation, aggregation, caching, and customization) is essentially what Google's MapReduce framework provides, and SNS essentially corresponds to Google's cluster software infrastructure, such as GFS and Chubby.
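To make the soft-state idea in point 3 concrete, here is a minimal sketch, not the paper's implementation. It assumes peers periodically report their load and that an entry simply expires if it is not refreshed. The class and method names (SoftStateTable, report, least_loaded) are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
class SoftStateTable:
    """Hypothetical in-memory table of peer load reports (illustration only).

    Nothing is written to disk: after a crash the table starts empty and is
    repopulated by the next round of peer reports.
    """

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._entries = {}  # peer id -> (load, time of last report)

    def report(self, peer: str, load: float, now: float) -> None:
        # A fresh report overwrites whatever was known about the peer.
        self._entries[peer] = (load, now)

    def live_peers(self, now: float) -> dict:
        # Peers that have not reported within the TTL are treated as gone.
        return {
            peer: load
            for peer, (load, seen) in self._entries.items()
            if now - seen <= self.ttl
        }

    def least_loaded(self, now: float):
        # Pick the live peer with the smallest reported load, if any.
        live = self.live_peers(now)
        return min(live, key=live.get) if live else None


table = SoftStateTable(ttl_seconds=5.0)
table.report("worker-a", load=1.0, now=0.0)
table.report("worker-b", load=3.0, now=2.0)
```

Because stale entries age out on their own, a failed peer needs no explicit removal, and a restarted table needs no recovery log.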