tl;dr Riemann is a monitoring system, so it emphasizes liveness over safety.

Riemann is aimed at high-throughput (millions of events/sec/node), partial-harvest event processing, where it is acceptable to trade completeness for throughput at low latencies. For instance, it’s probably fine to drop half of your request latency events on the floor, if you’re calculating a lossy histogram with sampling anyway. It’s also typically acceptable to have nondeterministic behavior with respect to time windows: if one node’s clock is skewed, it’s better to process it “soonish” rather than waiting an unbounded amount of time for it to check in.

There is no synchronization or relationship between events. Events are immutable and have a total order, even though a given server or client may only have a fraction of the relevant events for a system. The events are, in a sense, the transaction log–except that the semantics of those transactions depend on the stream configuration.

Continue reading (1143 words)

Mass distribution of learning material has been around for a few centuries and has yet to replace the process of guided learning. While it’s possible to amass facts and skills from reading and listening, it’s much more difficult to produce complex works of value without feedback on the process.

Doing mathematics isn’t just applying rules and techniques. It’s about knowing how to reason, and writing a proof in a way which communicates your reasoning clearly to others. You can get started by following along with proofs from a lecture, but in order to really ingrain the techniques in your brain, you have to write proofs of things you’ve never encountered before. Someone has to read those proofs, and give feedback on where your reasoning was unclear, incomplete, or flawed. They can suggest a different notation, or a shorter path to the same solution. Good teachers will leave notes: “this is a cool idea you’ve developed here, and it points towards this area of complex analysis we haven’t talked about yet.”

In psychology it’s not enough to memorize a summary text and a smattering of papers. You need to be asked questions. “There’s a critical flaw in this paper’s sampling methodology. Can you find it? How would you improve it?” “What kind of systematic bias can we expect in these results?” If nobody asks those questions, and helps you home in on the answers, you’ll miss out on half the text. You’ll be unprepared to evaluate the quality of others’ research–or to design experiments of your own.

Continue reading (276 words)

We got to talking about space warfare last night, and I realized something pretty weird: FTL drives effect massive shifts in velocity.

Almost every FTL spacecraft, in fiction, is capable of moving between planets in different star systems. The ship starts out roughly stationary relative to planet A, and winds up roughly stationary relative to planet B. How fast are A and B moving compared to one another? How fast do stars move?

Proxima Centauri has a radial velocity (relative to the solar system’s center of mass) of -21.7 +/- 1.8 km/s. Its proper motion vector is -3.77530 arcsec/year in right ascension, and 0.76933 arcsec/year in declination. At 4.243 light-years away, that proper motion corresponds to a tangential velocity of 23.777 km/s relative to Sol. Its total velocity relative to Sol is somewhere around 32.19 km/s, which is just a little faster than the Earth’s orbital velocity around the sun (about 29.8 km/s).
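For the curious, those figures follow from the standard tangential-velocity formula, where the constant 4.74 converts arcsec/year times parsecs into km/s:

\mu = \sqrt{3.77530^2 + 0.76933^2} \approx 3.853\ \mathrm{arcsec/yr}

d = 4.243\ \mathrm{ly} \approx 1.301\ \mathrm{pc}

v_t = 4.74\,\mu d \approx 4.74 \times 3.853 \times 1.301 \approx 23.8\ \mathrm{km/s}

v = \sqrt{v_r^2 + v_t^2} = \sqrt{21.7^2 + 23.8^2} \approx 32.2\ \mathrm{km/s}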

Continue reading (684 words)

I’ve been doing a lot of performance tuning in Riemann recently, especially in the clients–but I’d like to share a particularly spectacular improvement from yesterday.

The Riemann protocol

Riemann’s TCP protocol is really simple. Send a Msg to the server, receive a response Msg. Messages might include some new events for the server, or a query; a response might include a boolean acknowledgement or a list of events matching the query. The protocol is ordered: messages on a connection are processed in order, and responses sent in order. Each Msg is serialized using Protocol Buffers. To figure out how large each message is, you read a four-byte length header, then read that many bytes and parse them as a Msg.
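A minimal sketch of that framing (my own class and method names, not the client’s API; Msg stands in for the protobuf-generated message class):

import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Length-prefixed protobuf framing over TCP: a four-byte big-endian length,
// followed by exactly that many bytes of serialized Msg.
public class Framing {
  // Write one message: length header, then the serialized bytes.
  static void writeFrame(DataOutputStream out, byte[] serializedMsg) throws IOException {
    out.writeInt(serializedMsg.length);
    out.write(serializedMsg);
    out.flush();
  }

  // Read one message: length header, then exactly that many bytes.
  // Parse the result with the protobuf-generated Msg.parseFrom(bytes).
  static byte[] readFrame(DataInputStream in) throws IOException {
    int length = in.readInt();
    byte[] bytes = new byte[length];
    in.readFully(bytes);
    return bytes;
  }
}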

Continue reading (1182 words)

I’ve had two observations floating around in my head, looking for a way to connect with each other.

Many “architecture patterns” are scar tissue around the absence of higher-level language features.

and a criterion for choosing languages and designing APIs

Continue reading (2963 words)

I’ve been putting more work into riemann-java-client recently, since it’s definitely the bottleneck in performance testing Riemann itself. The existing RiemannTcpClient and RiemannRetryingTcpClient were threadsafe, but almost fully mutexed; using one essentially serialized all threads behind the client itself. For write-heavy workloads, I wanted to do better.

There are two logical optimizations I can make, in addition to choosing careful data structures, mucking with socket options, etc. The first is to bundle multiple events into a single Msg, which the API supports. However, your code may not be structured in a way that lets you efficiently bundle events, so where higher latencies are OK, the client can maintain a buffer of outbound events and flush it regularly.
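A minimal sketch of that buffering strategy (hypothetical names, not riemann-java-client’s actual API): accumulate events, and flush them as a single batch when the buffer fills or a timer fires.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical buffered sender: callers add events one at a time; the buffer
// is flushed as a single batch when it fills, or at a regular interval.
public class BufferedSender<E> {
  private final List<E> buffer = new ArrayList<>();
  private final int maxSize;
  private final Consumer<List<E>> flusher;  // e.g. bundle into one Msg and write it
  private final ScheduledExecutorService timer =
      Executors.newSingleThreadScheduledExecutor();

  public BufferedSender(int maxSize, long flushIntervalMs, Consumer<List<E>> flusher) {
    this.maxSize = maxSize;
    this.flusher = flusher;
    timer.scheduleAtFixedRate(this::flush, flushIntervalMs,
                              flushIntervalMs, TimeUnit.MILLISECONDS);
  }

  public synchronized void send(E event) {
    buffer.add(event);
    if (buffer.size() >= maxSize) flush();
  }

  public synchronized void flush() {
    if (buffer.isEmpty()) return;
    flusher.accept(new ArrayList<>(buffer));
    buffer.clear();
  }
}

The tradeoff is bounded extra latency–at most one flush interval–in exchange for fewer, larger writes.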

The second optimization is to take advantage of request pipelining. Riemann’s protocol is simple and synchronous: you send a Msg over a TCP connection, and receive exactly one Msg in response. The existing clients, however, forced you to wait a full round trip for each request: for the message to cross the network, for Riemann to process it, and for the acknowledgement to come back. We can do better by pipelining requests: sending new requests before the previous responses arrive, and matching up received messages with their corresponding requests later.
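Because the protocol is ordered, matching responses to requests only takes a FIFO queue of pending futures. A sketch (again, my names, not the client’s):

import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

// Pipelining sketch: send requests without waiting, and complete the oldest
// outstanding future whenever a response arrives. This works because Riemann
// answers requests in the order they were sent.
public class Pipeline<Req, Resp> {
  private final Queue<CompletableFuture<Resp>> inFlight = new ArrayDeque<>();

  // Writer side: enqueue a promise, then send. The lock keeps write order
  // and queue order consistent.
  public synchronized CompletableFuture<Resp> send(Req request, Consumer<Req> writer) {
    CompletableFuture<Resp> promise = new CompletableFuture<>();
    inFlight.add(promise);
    writer.accept(request);
    return promise;
  }

  // Reader side: each response belongs to the oldest pending request.
  public synchronized void onResponse(Resp response) {
    inFlight.remove().complete(response);
  }
}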

Continue reading (375 words)

Computer languages, like human languages, come in many forms. This post aims to give an overview of the most common programming ideas. It’s meant to be read as you learn a particular programming language, to help you understand your experience in a more general context. I’m writing for conceptual learners, who delight in the underlying structure and rules of a system.

Many of these concepts have varying (and conflicting) names. I’ve tried to include alternates wherever possible, so you can search this post when you run into an unfamiliar word.

Syntax

Continue reading (3096 words)

A good friend of mine from college has started teaching himself to code. He’s hoping to find a job at a Bay Area startup, and asked for some help getting oriented. I started writing a response, and it got a little out of hand. Figure this might be of interest to somebody else on this path. :)

I want to give you a larger context around how this field works–there’s a ton of good documentation on accomplishing specifics, but it’s hard to know how it fits together, sometimes. Might be interesting for you to skim this before we meet tomorrow, so some of the concepts will be familiar.

How software is made

Continue reading (2951 words)

Schadenfreude is a benchmarking tool I’m using to improve Riemann. Here’s a profile generated by the new riemann-bench, comparing a few recent releases in their single-threaded TCP server throughput. These results are dominated by loopback read latency–maxing out at about 8-9 kiloevents/sec. I’ll be using schadenfreude to improve client performance in high-volume and multicore scenarios.

[Figure: throughput.png, single-threaded TCP server throughput across recent Riemann releases]

Continue reading (58 words)

I needed a tool to evaluate internal and network benchmarks of Riemann, to ask questions like

  • Is parser function A or B more efficient?
  • How many threads should I allocate to the worker threadpool?
  • How did commit 2556 impact the latency distribution?

In dealing with “realtime” systems it’s often much more important to understand the latency distribution than a single throughput figure, and for GC reasons you often want to see its time dependence. Basho Bench does this well, but it’s written in Erlang, which rules out microbenchmarking Riemann functions (e.g. at the REPL). So I’ve hacked together this little thing I’m calling Schadenfreude (from German: “happiness at the misfortune of others”). Sums up how I feel about benchmarks in general.
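To illustrate the idea (a toy sketch, not Schadenfreude itself, which is Clojure): time each operation individually, then report percentiles instead of a lone average.

import java.util.Arrays;

// Toy latency benchmark: record every operation's latency so we can inspect
// the distribution (median, tail percentiles) rather than a single mean.
public class LatencyBench {
  public static void main(String[] args) {
    int n = 100_000;
    long[] latencies = new long[n];
    for (int i = 0; i < n; i++) {
      long start = System.nanoTime();
      operationUnderTest();
      latencies[i] = System.nanoTime() - start;
    }
    Arrays.sort(latencies);
    for (double q : new double[] {0.5, 0.95, 0.99, 0.999}) {
      System.out.printf("p%.1f: %d ns%n", q * 100, latencies[(int) (q * (n - 1))]);
    }
  }

  // Stand-in workload; a real benchmark also needs warmup, and must keep the
  // JIT from eliminating dead code.
  static void operationUnderTest() {
    Math.sqrt(System.nanoTime());
  }
}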

Continue reading (402 words)

Went digging through the FBI’s Uniform Crime Reports archives to make this chart. Banning “assault rifles” is not going to significantly reduce murders. If you want to fix that problem by regulating firearms, you’ll have to look at handguns.

[Figure: firearm-murder-by-type.png, murders by firearm type, from FBI Uniform Crime Reports data]

Two things to note here: First, all violent crime fell dramatically during the 90s. Second, we’re getting better at treating gunshot victims, so mortality rates have fallen.

Continue reading (70 words)

Inspired by Mark Reid’s post illustrating the bimodal relationship between the density of guns in a population and the number of gun homicides, I’ve created a slightly different plot from the same data, designed to illustrate a slightly… muddier relationship. This is an expanded variant of homicides vs guns, all countries, but plotting firearm homicides on a log10 scale shows the relationships between low-homicide countries.

library(directlabels)
library(lattice)

# Load gun ownership, homicide, and OECD membership tables.
guns <- read.table("guns/data/guns.csv", sep="\t", header=TRUE)
deaths <- read.table("guns/data/deaths.csv", sep="\t", header=TRUE)
oecd <- read.table("guns/data/oecd.csv", sep="\t", header=TRUE)

# Join guns and deaths by country, and flag OECD members.
data <- merge(guns, deaths, by="Country")
data$OECD <- data$Country %in% oecd$Country

# Scatterplot of homicides against gun ownership, one point per country,
# with homicides on a log10 scale and country names as direct labels.
plot(
  direct.label(
    xyplot(Homicides ~ Guns, data,
           groups=Country,
           main="Homicides vs. Guns",
           xlab="Guns per 100 people",
           ylab="Homicides per 100k people",
           scales=list(y = list(log = 10))),
    "top.points"))

[Figure: homicides.png, homicides per 100k people vs. guns per 100 people, log10 y-axis]

Continue reading (714 words)