In Chapter 1, I asserted that the grammar of Lisp is uniform: every expression is a list, beginning with a verb, and followed by some arguments. Evaluation proceeds from left to right, and every element of the list must be evaluated before evaluating the list itself. Yet we just saw, at the end of Sequences, an expression which seemed to violate these rules.
Clearly, this is not the whole story.
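To see how the rules can bend, consider if (a small example of my own choosing, not necessarily the expression that chapter ended with). If every element of the list were evaluated first, the assert below would throw. It doesn't:

user=> (if true :safe (assert false))
:safe

The final argument was never evaluated. No ordinary function could do that; something else is at work.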
In Chapter 3, we discovered functions as a way to abstract expressions; to rephrase a particular computation with some parts missing. We used functions to transform a single value. But what if we want to apply a function to more than one value at once? What about sequences?
For example, we know that (inc 2) increments the number 2. What if we wanted to increment every number in the vector [1 2 3], producing [2 3 4]?
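As a preview of where this chapter is headed: Clojure's map applies a function to every element of a sequence, and gives back the sequence of results.

user=> (map inc [1 2 3])
(2 3 4)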
We left off last chapter with a question: what are verbs, anyway? When you evaluate (type :mary-poppins), what really happens?
user=> (type :mary-poppins)
clojure.lang.Keyword
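Part of the answer is that verbs like type are themselves values: evaluating the bare symbol yields the function it names (the exact printed form varies between Clojure versions):

user=> type
#<core$type clojure.core$type@39bda9b9>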
We’ve learned the basics of Clojure’s syntax and evaluation model. Now we’ll take a tour of the basic nouns in the language.
Types
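For a first taste, we can aim type at a few simple nouns; the class names here assume a recent 64-bit JVM:

user=> (type 3)
java.lang.Long
user=> (type "cat")
java.lang.String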
This guide aims to introduce newcomers and experienced programmers alike to the beauty of functional programming, starting with the simplest building blocks of software. You’ll need a computer, basic proficiency in the command line, a text editor, and an internet connection. By the end of this series, you’ll have a thorough command of the Clojure programming language.
Who is this guide for?
Some folks have asked whether Cassandra or Riak in last-write-wins mode are monotonically consistent, or whether they can guarantee read-your-writes, and so on. This is a fascinating question, one which leads to all sorts of interesting issues around clocks and causality.
There are two families of clocks in distributed systems. The first are often termed wall clocks, which correspond roughly to the time obtained by looking at a clock on the wall. Most commonly, a process reads the wall clock via gettimeofday(); the operating system maintains that clock using a combination of hardware timers and NTP, a network time-synchronization service. On POSIX-compatible systems, this clock returns integers which map to real moments in time via a standard like UTC, POSIX time, or, less commonly, TAI or GPS.
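On the JVM, for example, Clojure reads this same operating-system wall clock through System/currentTimeMillis, which reports milliseconds since the POSIX epoch (the value below is illustrative):

user=> (System/currentTimeMillis)
1372531200000

Nothing prevents two successive reads from going backwards, if NTP steps the clock in between.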
Since the Strangeloop talks won’t be available for a few months, I recorded a new version of the talk as a Google Hangout.
Previously on Jepsen, we learned about Kafka’s proposed replication design.
Cassandra is a Dynamo system; like Riak, it divides a hash ring into several chunks, and keeps N replicas of each chunk on different nodes. It uses tunable quorums, hinted handoff, and active anti-entropy to keep replicas up to date. Unlike the Dynamo paper and some of its peers, Cassandra eschews vector clocks in favor of a pure last-write-wins approach.
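Last-write-wins itself is easy to sketch: given two versions of a cell, each stamped with a timestamp, keep whichever carries the higher one. Here's a toy Clojure model (mine, not Cassandra's code):

user=> (defn lww-merge [a b] (if (>= (:ts a) (:ts b)) a b))
#'user/lww-merge
user=> (lww-merge {:ts 100, :value "old"} {:ts 200, :value "new"})
{:ts 200, :value "new"}

The catch is that those timestamps come from wall clocks, and different nodes' wall clocks are free to disagree.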
In the last Jepsen post, we learned about NuoDB. Now it’s time to switch gears and discuss Kafka. Up next: Cassandra.
Kafka is a messaging system which provides an immutable, linearizable, sharded log of messages. Throughput and storage capacity scale linearly with nodes, and thanks to some impressive engineering tricks, Kafka can push astonishingly high volume through each node, often saturating disk, network, or both. Consumers use Zookeeper to coordinate their reads over the message log, providing efficient at-least-once delivery, along with some other nice properties, like replayability.
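The at-least-once property falls out of the order of operations: a consumer processes a message first, and records its offset second, so a crash between those two steps means the message is replayed on restart. A toy sketch of that loop in Clojure (illustrative names, not Kafka's client API):

user=> (def offset (atom 0))
#'user/offset
user=> (doseq [i (range @offset 3)]
         (println "processing message" i)
         (reset! offset (inc i)))
processing message 0
processing message 1
processing message 2
nil

If the process dies after processing but before the reset!, that message is consumed again after recovery: hence at least once, rather than exactly once.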
Previously on Jepsen, we explored Zookeeper. Next up: Kafka.
NuoDB came to my attention through an amazing mailing list thread by the famous database engineer Jim Starkey, in which he argues that he has disproved the CAP theorem.
In this Jepsen post, we’ll explore Zookeeper. Up next: NuoDB.
Update 2019-07-23: @insumity explains that ZooKeeper sync+read is not, in fact, linearizable; there are conditions under which it might return stale reads.