I’ve been focusing on Riemann client libraries and optimizations recently, both at Boundary and on my own time.

Boundary uses the JVM extensively and takes advantage of Coda Hale’s Metrics. For our applications I’ve written a Riemann Java client that speaks UDP and TCP and includes a Metrics reporter. The reporter (which I’ll be submitting to metrics-contrib later) sends periodic events for each metric in a registry, and optionally some VM statistics as well. It can prefix each service name, filter metrics with predicates, and has been reporting for two of our production systems for about a week now.

The Java client has been integrated into Riemann itself, replacing the old Aleph client. It’s about on par with the old Aleph client, owing to its use of standard Socket and friends as opposed to Netty. Mårten Gustafson and Edward Ribeiro have been instrumental in getting the Java client up and running, so my sincere thanks go out to both of them.
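A client built on plain sockets has to frame its own messages on the TCP stream. Here is a minimal sketch of length-prefixed framing, assuming a 4-byte big-endian length header in front of each serialized message. The class and method names (`LengthFraming`, `writeFrame`, `readFrame`) are hypothetical and not taken from the actual client, and the payload is just opaque bytes standing in for an encoded message.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: illustrates length-prefixed framing, not the real client code.
public class LengthFraming {

    // Write a 4-byte big-endian length header, then the payload bytes.
    static void writeFrame(OutputStream out, byte[] payload) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(payload.length); // DataOutputStream writes ints big-endian
        data.write(payload);
        data.flush();
    }

    // Read one frame: the length header first, then exactly that many bytes.
    static byte[] readFrame(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt();
        if (length < 0) {
            throw new IOException("negative frame length: " + length);
        }
        byte[] payload = new byte[length];
        data.readFully(payload); // blocks until the whole frame has arrived
        return payload;
    }

    public static void main(String[] args) throws IOException {
        // In-memory streams stand in for a Socket's output and input streams.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeFrame(wire, "first".getBytes(StandardCharsets.UTF_8));
        writeFrame(wire, "second".getBytes(StandardCharsets.UTF_8));

        InputStream in = new ByteArrayInputStream(wire.toByteArray());
        System.out.println(new String(readFrame(in), StandardCharsets.UTF_8));
        System.out.println(new String(readFrame(in), StandardCharsets.UTF_8));
    }
}
```

With a real connection, the same two methods would be handed `socket.getOutputStream()` and `socket.getInputStream()`; `readFully` is what keeps a frame split across several TCP packets from being read short.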

I also removed the last traces of Aleph from riemann.server, replacing the TCP server with a pure Netty implementation, and swapped Gloss for Netty-provided length header parsers, which cuts down on copying somewhat. Here’s the performance of a single-threaded localhost client which sends an event and receives an OK response:

[Figure: drop tcp events latency, Aleph (left) vs. raw Netty (right)]
[Figure: drop tcp events throughput, Aleph (left) vs. raw Netty (right)]

Steady-state throughput with raw Netty is about 2.5 times higher. Median and 95th-percentile latencies are significantly lower, though occasional 20ms spikes are still present (I presume due to GC). Please keep in mind these graphs can only be compared with each other; they depend significantly on the hardware and JVM. This also does not represent concurrent performance: I’m trying to optimize the simplest system first before moving up. With that in mind, Riemann’s real-world performance with these changes should be “much faster”.
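For anyone wanting to pull similar summary numbers out of their own measurements, here is a rough sketch of computing a median and a 95th percentile from a list of latency samples. It assumes the nearest-rank definition of a percentile, which is only one of several common conventions and not necessarily what the graphs above use; the class name `LatencyPercentiles` and the sample values are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical helper: nearest-rank percentiles over latency samples.
public class LatencyPercentiles {

    // Returns the p-th percentile (0..100) of the samples using the
    // nearest-rank method: the smallest value such that at least p percent
    // of the samples are less than or equal to it.
    static long percentile(long[] samples, int p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p < 0 || p > 100) {
            throw new IllegalArgumentException("percentile out of range: " + p);
        }
        long[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        // rank = ceil(p / 100 * n), computed in integer arithmetic
        int rank = (int) (((long) p * sorted.length + 99) / 100);
        return sorted[Math.max(rank - 1, 0)];
    }

    public static void main(String[] args) {
        // Made-up round-trip times in microseconds, including one GC-like spike.
        long[] latenciesMicros = {310, 295, 330, 305, 20000, 298, 315, 301, 299, 320};
        System.out.println("median: " + percentile(latenciesMicros, 50) + " us");
        System.out.println("95th:   " + percentile(latenciesMicros, 95) + " us");
    }
}
```

A single large spike barely moves the median but dominates the high percentiles on a small sample, which is one reason to look at both.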

Next up I’ll be replacing clojure-protobuf with direct use of the Java protobuf classes; as I’m copying data into a standard Map anyway it should be slightly faster and consolidate codepaths between server and client. I’ll also begin type-hinting key sections of the server and message parser to reduce use of reflection.
