In the previous post, I described an approximation of Heroku’s Bamboo routing stack, based on their blog posts. Hacker News, as usual, is outraged that the difficulty of building fast, reliable distributed systems could prevent Heroku from building a magically optimal architecture. Coda Hale quips:
Really enjoying @RapGenius’s latest mix tape, “I Have No Idea How Distributed Systems Work”.
For more on Timelike and routing simulation, check out part 2 of this article: everything fails all the time. There’s also more discussion on Reddit.
RapGenius is upset about Heroku’s routing infrastructure. RapGenius, like many web sites, uses Rails, and Rails is notoriously difficult to operate in a multithreaded environment. Heroku operates at large scale, and made engineering tradeoffs which gave rise to high latencies–latencies with adverse effects on customers. I’d like to explore why Heroku’s Bamboo architecture behaves this way, and help readers reason about their own network infrastructure.
I’m not a big fan of legal documents. I just don’t have the resources or ability to reasonably defend myself from a lawsuit; retaining a lawyer for a dozen hours would literally bankrupt me. Even if I were able to defend myself against a legal challenge, standard contracts for software consulting are absurd. Here’s a section I encounter frequently:
Ownership of Work Product. All Work Product (as defined below) and benefits thereof shall immediately and automatically be the sole and absolute property of Company, and Company shall own all Work Product developed pursuant to this Agreement.
“Work Product” means each invention, modification, discovery, design, development, improvement, process, software program, work of authorship, documentation, formula, data, technique, know-how, secret or intellectual property right whatsoever or any interest therein (whether or not patentable or registrable under copyright or similar statutes or subject to analogous protection) that is made, conceived, discovered, or reduced to practice by Contractor (either alone or with others) and that (i) relates to Company’s business or any customer of or supplier to Company or any of the products or services being developed, manufactured or sold by Company or which may be used in relation therewith, (ii) results from the services performed by Contractor for Company or (iii) results from the use of premises or personal property (whether tangible or intangible) owned, leased or contracted for by Company.
Michael Robertson writes:
@Jason @MicahSingleton Biz can pursue profits or racism, but not both. Tech industry is a meritocracy as is all industries in a free market.
I have it pretty good, in America. I’m White, male, young. Grew up with books. With enough food on the table during critical phases of brain development. In a neighborhood composed of people who looked and spoke like me, a neighborhood with a creek, and trees, and street hockey, somewhere safe. Through deterministic happenstance–a confluence of genetics and education and economics and municipal investment in public education and intellectually challenging parents and the right teachers at pivotal moments–I’m good at thinking about a class of problem which too few people are working on, and present market dynamics allow me to do what I love for far more money than I need.
People grant me the authority to speak as is expected of males, with the lack of recognition of my skin color that comes for people of northern European origin, and for my youth I am forgiven all manner of brash and disrespectful rejoinders. I am significantly more likely to be a victim of a murder, and feel constant pressure to be resolute, correct, gruff. I have never worried for my physical safety in the presence of male companions, and think nothing of walking alone at night. As a motorcyclist and as an engineer I am never the odd one out. I can wear comfortable clothes at formal gatherings. I can enter any building freely, and when boarding a bus, folks never rustle and stare at the delay. I feel tremendously self-conscious when surrounded by people of color. My coworkers never comment about how pretty I am. I am never expected to speak for all young, White males.
tl;dr Riemann is a monitoring system, so it emphasizes liveness over safety.
Riemann is aimed at high-throughput (millions of events/sec/node), partial-harvest event processing, where it is acceptable to trade completeness for throughput at low latencies. For instance, it’s probably fine to drop half of your request latency events on the floor, if you’re calculating a lossy histogram with sampling anyway. It’s also typically acceptable to have nondeterministic behavior with respect to time windows: if one node’s clock is skewed, it’s better to process its events “soonish” rather than waiting an unbounded amount of time for that node to check in.
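To make the "drop half of your events" claim concrete, here is a minimal sketch in Python. It is not Riemann's API (Riemann streams are written in Clojure); the function names, the exponential latency distribution, and the 100 ms mean are all assumptions chosen for illustration. It keeps each event with probability 0.5 and compares the 95th percentile of the sample against the full data set.

```python
import random


def sample_events(events, keep_probability, rng):
    """Keep each event independently with probability keep_probability."""
    return [e for e in events if rng.random() < keep_probability]


def quantile(values, q):
    """Nearest-rank quantile of a non-empty list, for 0 <= q <= 1."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


# Hypothetical request latencies in milliseconds (assumed exponential, mean 100).
rng = random.Random(42)
latencies = [rng.expovariate(1 / 100.0) for _ in range(100_000)]

# Drop roughly half of the events on the floor.
kept = sample_events(latencies, 0.5, rng)

full_p95 = quantile(latencies, 0.95)
sampled_p95 = quantile(kept, 0.95)
relative_error = abs(sampled_p95 - full_p95) / full_p95
```

With this many events the sampled percentile should land close to the full one, which is why trading completeness for throughput is tolerable for this kind of summary statistic. The same would not hold for something like an exact count or a sum.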
Mass distribution of learning material has been around for a few centuries and has yet to replace the process of guided learning. While it’s possible to amass facts and skills from reading and listening, it’s much more difficult to produce complex works of value without feedback on the process.
Doing mathematics isn’t just applying rules and techniques. It’s about knowing how to reason, and writing a proof in a way which communicates your reasoning clearly to others. You can get started by following along with proofs from a lecture, but in order to really ingrain the techniques in your brain, you have to write proofs of things you’ve never encountered before. Someone has to read those proofs, and give feedback on where your reasoning was unclear, incomplete, or flawed. They can suggest a different notation, or a shorter path to the same solution. Good teachers will leave notes: “this is a cool idea you’ve developed here, and it points towards this area of complex analysis we haven’t talked about yet.”
We got to talking about space warfare last night, and I realized something pretty weird: FTL drives effect massive shifts in velocity.
Almost every FTL spacecraft, in fiction, is capable of moving between planets in different star systems. The ship starts out roughly stationary relative to planet A, and winds up roughly stationary relative to planet B. How fast are A and B moving compared to one another? How fast do stars move?
I’ve been doing a lot of performance tuning in Riemann recently, especially in the clients–but I’d like to share a particularly spectacular improvement from yesterday.
The Riemann protocol
I’ve had two observations floating around in my head, looking for a way to connect with each other.
Many “architecture patterns” are scar tissue around the absence of higher-level language features.
I’ve been putting more work into riemann-java-client recently, since it’s definitely the bottleneck in performance testing Riemann itself. The existing RiemannTcpClient and RiemannRetryingTcpClient were threadsafe, but almost fully mutexed; using one essentially serialized all threads behind the client itself. For write-heavy workloads, I wanted to do better.
There are two logical optimizations I can make, in addition to choosing careful data structures, mucking with socket options, etc. The first is to bundle multiple events into a single Message, which the API supports. However, your code may not be structured in a way to efficiently bundle events, so where higher latencies are OK, the client can maintain a buffer of outbound events and flush it regularly.
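The buffering idea can be sketched roughly as follows. This is a Python illustration, not the riemann-java-client implementation: `BufferingClient`, `send_message`, `max_batch`, and `start_flusher` are hypothetical names, and the transport is reduced to a callback that receives one list of events per message.

```python
import threading


class BufferingClient:
    """Collects outbound events and sends them as one batch per message."""

    def __init__(self, send_message, max_batch=100):
        self._send_message = send_message  # hypothetical transport callback
        self._max_batch = max_batch
        self._lock = threading.Lock()
        self._buffer = []

    def send_event(self, event):
        # Hold the lock only long enough to append, not for network I/O.
        with self._lock:
            self._buffer.append(event)
            full = len(self._buffer) >= self._max_batch
        if full:
            self.flush()

    def flush(self):
        # Swap the buffer out under the lock, then send outside it.
        with self._lock:
            batch, self._buffer = self._buffer, []
        if batch:
            self._send_message(batch)

    def start_flusher(self, interval_seconds):
        """Flush on a fixed interval; set the returned Event to stop."""
        stop = threading.Event()

        def loop():
            while not stop.wait(interval_seconds):
                self.flush()

        threading.Thread(target=loop, daemon=True).start()
        return stop


# Usage: seven events with a batch size of three, then a final flush.
sent = []
client = BufferingClient(sent.append, max_batch=3)
for i in range(7):
    client.send_event({"service": "api latency", "metric": i})
client.flush()
batch_sizes = [len(batch) for batch in sent]
```

The tradeoff is the one described above: events wait in the buffer until it fills or the interval elapses, so latency goes up in exchange for fewer, larger messages. Callers also contend only on a short append rather than on the whole send.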
Computer languages, like human languages, come in many forms. This post aims to give an overview of the most common programming ideas. It’s meant to be read as you learn a particular programming language, to help you understand your experience in a more general context. I’m writing for conceptual learners, who delight in the underlying structure and rules of a system.
Many of these concepts have varying (and conflicting) names. I’ve tried to include alternates wherever possible, so you can search this post when you run into an unfamiliar word.