I’ve been doing a lot of performance tuning in Riemann recently, especially in the clients–but I’d like to share a particularly spectacular improvement from yesterday.

Riemann’s TCP protocol is really simple. Send a Msg to the server, receive a response Msg. Messages might include some new events for the server, or a query; and a response might include a boolean acknowledgement or a list of events matching the query. The protocol is ordered; messages on a connection are processed in-order and responses sent in-order. Each Msg is serialized using Protocol Buffers. To figure out how large each message is, you read a four-byte length header, then read that many bytes, and parse them as a Msg.

    time --->
    send: [length1][msg1] [length2][msg2]
    recv: [length1][msg1] [length2][msg2]
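Here's a rough sketch of that framing in Python, just to make the length-header idea concrete. It is not the Riemann client code: the helper names are made up for illustration, the payloads are plain bytes standing in for serialized Msgs, and the big-endian byte order of the header is an assumption.

```python
import io
import struct

def frame(payload: bytes) -> bytes:
    """Prefix a serialized message with a four-byte length header."""
    # Assumption: the length is an unsigned 32-bit integer in big-endian order.
    return struct.pack(">I", len(payload)) + payload

def read_exactly(stream, n: int) -> bytes:
    """Read exactly n bytes, or raise if the stream ends early."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream closed mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def read_frame(stream) -> bytes:
    """Read one length-prefixed message body from a file-like stream."""
    (length,) = struct.unpack(">I", read_exactly(stream, 4))
    return read_exactly(stream, length)

# Two messages back to back on one "connection", read in order.
wire = io.BytesIO(frame(b"msg1") + frame(b"second message"))
first = read_frame(wire)
second = read_frame(wire)
```

With a real socket you would wrap it in a file-like object (for example with `socket.makefile("rb")`) and hand the bytes from `read_frame` to the Protocol Buffers parser.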

Continue reading (1182 words)

I’ve had two observations floating around in my head, looking for a way to connect with each other.

Many “architecture patterns” are scar tissue around the absence of higher-level language features.

and a criterion for choosing languages and designing APIs

Continue reading (2963 words)

I’ve been putting more work into riemann-java-client recently, since it’s definitely the bottleneck in performance testing Riemann itself. The existing RiemannTcpClient and RiemannRetryingTcpClient were threadsafe, but almost fully mutexed; using one essentially serialized all threads behind the client itself. For write-heavy workloads, I wanted to do better.

There are two logical optimizations I can make, in addition to choosing careful data structures, mucking with socket options, etc. The first is to bundle multiple events into a single Message, which the API supports. However, your code may not be structured in a way to efficiently bundle events, so where higher latencies are OK, the client can maintain a buffer of outbound events and flush it regularly.
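To illustrate the buffering idea, here is a minimal sketch in Python. The class name `BufferingSender`, the `send_batch` callback, and the `max_batch` parameter are all hypothetical; the actual riemann-java-client may do this quite differently.

```python
import threading

class BufferingSender:
    """Collects events and hands them to `send_batch` in bundles."""

    def __init__(self, send_batch, max_batch=100):
        # Hypothetical callback: sends a list of events as one message.
        self._send_batch = send_batch
        self._max_batch = max_batch
        self._lock = threading.Lock()
        self._buffer = []

    def submit(self, event):
        """Queue an event; flush right away if the buffer is full."""
        with self._lock:
            self._buffer.append(event)
            full = len(self._buffer) >= self._max_batch
        if full:
            self.flush()

    def flush(self):
        """Send everything buffered so far as a single batch."""
        with self._lock:
            batch, self._buffer = self._buffer, []
        if batch:
            self._send_batch(batch)

sent = []
sender = BufferingSender(sent.append, max_batch=3)
for i in range(7):
    sender.submit({"service": "demo", "metric": i})
sender.flush()  # a real client would call this periodically from a timer
```

The tradeoff is the one described above: an event may sit in the buffer until the next flush, so this only makes sense where that extra latency is acceptable.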

The second optimization is to take advantage of request pipelining. Riemann’s protocol is simple and synchronous: you send a Message over a TCP connection, and receive exactly one Message in response. The existing clients, however, forced you to wait n milliseconds for each message to cross the network, be processed by Riemann, and be acknowledged before you could send the next. We can do better by pipelining requests: sending new requests before waiting for the previous responses, and matching up received messages with their corresponding requests later.
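Because responses come back in the same order as requests, matching them up can be as simple as a FIFO queue of pending promises. The sketch below shows that idea in Python; `PipelinedConnection`, `write`, and `on_response` are hypothetical names, and there is no real socket or reader thread here.

```python
from collections import deque
from concurrent.futures import Future

class PipelinedConnection:
    """Matches in-order responses to outstanding requests."""

    def __init__(self, write):
        # Hypothetical callback: writes one request to the connection.
        self._write = write
        # Futures for requests still awaiting a response, oldest first.
        self._pending = deque()

    def send(self, request) -> Future:
        """Write a request without waiting; return a future for its response."""
        future = Future()
        self._pending.append(future)
        self._write(request)
        return future

    def on_response(self, response):
        """Called by the reader for each response, in arrival order."""
        self._pending.popleft().set_result(response)

written = []
conn = PipelinedConnection(written.append)
f1 = conn.send("req1")
f2 = conn.send("req2")  # sent before req1's response has arrived
conn.on_response("ack1")
conn.on_response("ack2")
```

This version is not threadsafe: with several writer threads, enqueueing the future and writing the request would have to happen under one lock so the queue order matches the order on the wire.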

Continue reading (375 words)

Computer languages, like human languages, come in many forms. This post aims to give an overview of the most common programming ideas. It’s meant to be read as one is learning a particular programming language, to help understand your experience in a more general context. I’m writing for conceptual learners, who delight in the underlying structure and rules of a system.

Many of these concepts have varying (and conflicting) names. I’ve tried to include alternates wherever possible, so you can search this post when you run into an unfamiliar word.

Every program has two readers: the computer, and the human. Your job is to communicate clearly to both. Programs are a bit like poetry in that regard–there can be rules about the rhythm of words, how punctuation works, whether adjectives precede nouns, and so forth.

Continue reading (3096 words)

A good friend of mine from college has started teaching himself to code. He’s hoping to find a job at a Bay Area startup, and asked for some help getting oriented. I started writing a response, and it got a little out of hand. Figure this might be of interest for somebody else on this path. :)

I want to give you a larger context around how this field works–there’s a ton of good documentation on accomplishing specifics, but it’s hard to know how it fits together, sometimes. Might be interesting for you to skim this before we meet tomorrow, so some of the concepts will be familiar.

There are two big spheres of “technical” activity, generally referred to as “development” and “operations”. Development is about writing and refining software, and operations is about publishing and running it. In general, I think development is a little more about fluid intelligence and language, and ops is more about having broad experience and integrating disparate pieces.

Continue reading (2951 words)

Schadenfreude is a benchmarking tool I'm using to improve Riemann. Here's a profile generated by the new riemann-bench, comparing a few recent releases in their single-threaded TCP server throughput. These results are dominated by loopback read latency–maxing out at about 8-9 kiloevents/sec. I'll be using schadenfreude to improve client performance in high-volume and multicore scenarios.

throughput.png

I needed a tool to evaluate internal and network benchmarks of Riemann, to ask questions like

  • Is parser function A or B more efficient?
  • How many threads should I allocate to the worker threadpool?
  • How did commit 2556 impact the latency distribution?

In dealing with “realtime” systems it’s often a lot more important to understand the latency distribution than a single throughput figure, and for GC reasons you often want to see a time dependence. Basho Bench does this well, but it’s in Erlang, which rules out microbenchmarking of Riemann functions (e.g. at the REPL). So I’ve hacked together this little thing I’m calling Schadenfreude (from German; “happiness at the misfortune of others”). Sums up how I feel about benchmarks in general.
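To show what "latency distribution rather than a single figure" means in practice, here is a small Python sketch that times a function repeatedly and reports a few percentiles. It is not how Schadenfreude works; the function names and the nearest-rank percentile definition are choices made for this illustration.

```python
import math
import time

def measure_latencies(f, runs=1000):
    """Call f repeatedly and record how long each call takes, in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        f()
        samples.append(time.perf_counter() - start)
    return samples

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of
    all samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

latencies = measure_latencies(lambda: sum(range(1000)), runs=200)
summary = {p: percentile(latencies, p) for p in (50, 95, 99, 100)}
```

Keeping the raw samples in call order, rather than only the summary, also lets you plot latency against time to spot GC pauses.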

Continue reading (402 words)

Went digging through the FBI’s Uniform Crime Reports archives to make this chart. Banning “assault rifles” is not going to significantly reduce murders. If you want to fix that problem by regulating firearms, you’ll have to look at handguns.

firearm-murder-by-type.png

Two things to note here: First, all violent crime fell dramatically during the 90s. Second, we’re getting better at treating gunshot victims, so mortality rates have fallen.

Continue reading (70 words)

Inspired by Mark Reid’s post illustrating the bimodal relationship between the density of guns in a population and the number of gun homicides, I’ve created a slightly different plot from the same data, designed to illustrate a slightly… muddier relationship. This is an expanded variant of homicides vs guns, all countries, but plotting firearm homicides on a log (base 10) scale shows the relationships between low-homicide countries.

library(directlabels)
library(lattice)

guns <- read.table("guns/data/guns.csv", sep="\t", header=TRUE)
deaths <- read.table("guns/data/deaths.csv", sep="\t", header=TRUE)
oecd <- read.table("guns/data/oecd.csv", sep="\t", header=TRUE)

data <- merge(guns, deaths, by="Country")
data$OECD <- data$Country %in% oecd$Country

plot(
  direct.label(
    xyplot(Homicides ~ Guns, data, group=Country,
           main="Homicides vs. Guns",
           xlab="Guns per 100 people",
           ylab="Homicides vs 100k people",
           scales=list(y = list(log = 10))),
    "top.points"))

homicides.png

Continue reading (714 words)

In the just-released riemann-java-client 0.0.6, riemann-clojure-client 0.0.6, riemann-ruby-client 0.0.8, and the upcoming riemann 0.1.4 (presently in master), Riemann will support two new types of metrics for events.

https://github.com/aphyr/riemann-java-client/blob/master/src/main/proto/riemann/proto.proto#L24

// signed 64-bit integers (variable-width-encoded)
optional sint64 metric_sint64 = 13;

// double-precision IEEE 754 floats
optional double metric_d = 14;
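"Variable-width-encoded" here refers to how Protocol Buffers stores sint64 values: the signed integer is ZigZag-mapped to an unsigned one, then written as a base-128 varint, so small magnitudes (positive or negative) take few bytes. A minimal Python sketch of the value encoding follows; the function names are made up for illustration, and the field tag that precedes the value on the wire is left out.

```python
def zigzag64(n: int) -> int:
    """Map a signed 64-bit integer to unsigned: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return ((n << 1) ^ (n >> 63)) & 0xFFFFFFFFFFFFFFFF

def varint(u: int) -> bytes:
    """Encode an unsigned integer seven bits at a time, low bits first.
    The high bit of each byte says whether more bytes follow."""
    out = bytearray()
    while True:
        low7 = u & 0x7F
        u >>= 7
        if u:
            out.append(low7 | 0x80)
        else:
            out.append(low7)
            return bytes(out)

def encode_sint64(n: int) -> bytes:
    """Value bytes for a sint64 field (without the field tag)."""
    return varint(zigzag64(n))

small = encode_sint64(-1)       # one byte
large = encode_sint64(-2**63)   # the worst case: ten bytes
```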

Continue reading (261 words)

Nassim Nicholas Taleb has written a piece on futurism which is making me feel, well, contradictory. Apologies for my writing: fighting a killer headache this week.

Taleb asserts that the present has changed little from the past; that “futurists always get it wrong”, and that if we wish to envision the future we should subtract from the present things which do not belong. I believe the present is so different from the past that it would be shocking to humans from even a few centuries ago. Technology is culture, and our immersion in culture makes it quite difficult to understand just how unusual we are.

My great-grandparents were immigrant farmers. Most people were: prior to industrialization in the late 1700s, the vast majority of humans grew their own food or were engaged in providing it to others. Now worldwide, only a third of our workers grow food. The US’s agricultural output has almost tripled over the past sixty years, a result of phenomenal improvements in efficiency made possible by the widespread use of petrochemicals–an energy-dense store only accessible for the past few hundred years. That same industrial revolution cut the fraction of our population employed in agriculture from 75% to only 3%.

Continue reading (1416 words)

When Tyler and I rented this apartment together, we knew we wanted a table. Our common room has a linear kitchen at one end and the couch & coffee table at the other. Our plan (and in concordance with FARMHOUSE KITCHEN, DIFFERENT CHAIRS, and HALF-FILLED WALL) was to divide the two spaces with the dining table–and to get some extra counter and storage space. With tons of natural light, white walls, and blond flooring, we knew we wanted a solid, darker piece to balance the room–something with rough, warm materials. It also needed to be unusually high, to provide a standing work surface. After rejecting a few expensive and ill-sized pieces from craigslist, catalogs and furniture stores, we decided to build one ourselves.

I saw my first live-edge table freshman year at Carleton; an acquaintance had completed one as a part of their woodworking study, and invited a small group to dinner to celebrate. Oak, I believe–roughly eight feet by 40 inches, a beautiful pair of book-matched slabs cleaved perfectly from bark to core, and polished to a fine sheen. The top rested on legs only 18 inches in height; we sat on the floor or cushions. For seven years the feeling of that table resonated in my memory. Now I had the workshop and living space to make one myself.

I should mention that I’m an amateur woodworker at best. My grandpa was a fine craftsman in furniture and instruments, and my father built most of the furniture in our house, but my contact with the art is limited to helping my dad build storage lofts, beds, and the like, and small projects by myself. I have a good intuitive sense of space and color, and of force geometry; a beginner’s understanding of wood grain and technique; a few basic tools; an ample supply of patience; and a willingness to learn from books, lore, and by experiment. In many aspects of this design I chose the path which was cheaper, or accessible with limited tools, or allowed more tolerance for error. There’s a huge gap in design and execution between this table and a professional piece. And still: I wouldn’t give up the hundred-odd hours I invested for a finer piece built by someone else.

Continue reading (2809 words)

Copyright © 2018 Kyle Kingsbury.
Non-commercial re-use with attribution encouraged; all other rights reserved.
Comments are the property of respective posters.