In SupplyFrame we started to use Riemann as our stream processing framework to do system and application monitoring. Riemann is a lightweight clojure based DSL operates on event streams. The power of Clojure expressiveness gives it ability to encode the whole event handling logic into one config file. Compare to generic stream framework like Esper or Storm, it is cleaner, simpler, but still highly programmable.
For example, If we want to send the mean data of last 10 minutes to graphite and alert sysops if the median of metric is more than 1000, we can express the idea in riemann like this:
Pretty clean, right?
How to install & deploy Riemann
On the homepage of riemann website, there are some prebuilt packages available. You can simply download the
deb package and
$ sudo dpkg -i riemann.0.2.x.deb to install it. The package will install the configuration file into
/etc/riemann/riemann.config and you can use
sudo service riemann start/stop/restart/reload to run or to stop it. If you prefer to use it as an instance instead of a system service, you can clone the project and use leiningen to install the dependencies and compile the project. You can either run Riemann in a screen or use nohup to force it to run as a daemon. If you run Riemann as an instance, remember to assign each Riemann with different ports that it listens to.
Riemann DSL structure
Riemann is designed to operate on flows of events sent over protocol buffers. An event is basically a map of these fields (copied from riemann documentation):
|host||A hostname, e.g. "api1", "foo.com"|
|service||e.g. "API port 8000 reqs/sec"|
|state||Any string less than 255 bytes, e.g. "ok", "warning", "critical"|
|time||The time of the event, in unix epoch seconds|
|tags||Freeform list of strings, e.g. ["rate", "fooproduct", "transient"]|
|metric||A number associated with this event, e.g. the number of reqs/sec.|
|ttl||A floating-point time, in seconds, that this event is considered valid for. Expired states may be removed from the index.|
You can think that the events in Riemann are nouns, and we're going to use verbs, streams and stream functions to process them.
Streams and stream functions
(streams ...) is the magic macro that forms the unique DSL of processing events. The expressions in
(streams ...) are treated as rules that process the same event simultaneously; while the nested s-expressions in
(streams...) will pipe the results from parent s-expression to child s-expressions.
You can tell from the previous example,
(folds/mean ...) and
(folds/median ...) are at the same level, which means they're processing the same event. And the event pipeline handling logic is expressed in nested s-expressions.
How to handle an event manually 1
Every event handler is a function that takes an event or list of events as input. If you want to deal with an event on the most nested s-expression, its easy. An anonymous function will do the job:
For more example of the basic usage of Riemann and standard stream processing functions, you can find it at Riemann's HOWTO page.
How to handle an event manually 2
In most of the use cases, Riemann's built-in stream aggregation functions solve the problems. However, in some rare cases you might also want to write nested stream processing function similar to the built-in functions. Here's how you can do it:
Here's an use case. We want to detect errors and those events which exceeds defined threshold. In most alerting systems, they either notify sysops with all peak events or smoothen the data which is even more error prone. We want something like this: in time period if the ratio of overshoot or error events is more than x, pipe the events to handler functions. Here's how we implemented it in Riemann:
With this function, now we can define our event logic as
For common Riemann tasks and functions, the official Riemann's HOWTO page is a great reference. For Clojure's syntax and semantics, there's a lot of resources and tutorials in the web; Full disclojure video series. is a good start to get familiar with it.