Application servers generate many different types of data:
These various pieces of data are generally intended to be delivered to a number of different destinations and audiences.
Getting the data to each of these separate destinations usually requires a specific set of tools and technologies.
We've spoken to a number of folks, both inside Mozilla and at other well-known tech companies, and everyone is doing pretty much the same thing: wiring up a number of different open source and paid service provider options into one pseudo-integrated system held together with duct tape and baling wire.
And they're all spending more time, energy, and money on it than they'd like to be.
If everyone is putting in more effort than they'd like into managing and integrating all of these separate services, then maybe it's a sign that the tools themselves should be more tightly integrated.
There are many domain specific differences, but most of these requirements can be described as a need to perform the following steps:
Heka is a tool that hopes to help solve this problem in the general case. It is a framework for building systems which perform these tasks:
Heka achieves its goals (hopefully!) by the use of a plugin architecture very much inspired by Logstash. There are four different types of plugins:
Heka inputs receive data from the outside world, in any number of arbitrary ways:
Heka decoders convert raw data received by an input into Message structs that can be processed by filters and outputs. Decoders can convert from various formats:
Heka filters process decoded messages. They can do any required crunching and collating:
After a filter has processed a message, it can perform any of three steps:
Heka outputs write data to or trigger activity in the outside world. They perform tasks such as:
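Taken together, the four plugin types form a pipeline: raw data comes in, is decoded into a message, is processed by a filter, and is written out. The following is a minimal Go sketch of that flow, not Heka's actual plugin API. The interface names, the `Message` fields, and the `type|logger|payload` record format are all assumptions made for illustration, and a plain slice of strings stands in for an input plugin.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a simplified, hypothetical stand-in for Heka's message struct.
type Message struct {
	Type    string
	Logger  string
	Payload string
}

// Decoder converts raw bytes received by an input into a Message.
type Decoder interface {
	Decode(raw []byte) (*Message, error)
}

// Filter processes a decoded message and returns the messages to pass on.
type Filter interface {
	Process(msg *Message) []*Message
}

// Output writes a message to the outside world.
type Output interface {
	Write(msg *Message) error
}

// pipeDecoder parses records of the (assumed) form "type|logger|payload".
type pipeDecoder struct{}

func (pipeDecoder) Decode(raw []byte) (*Message, error) {
	parts := strings.SplitN(string(raw), "|", 3)
	if len(parts) != 3 {
		return nil, fmt.Errorf("malformed record: %q", raw)
	}
	return &Message{Type: parts[0], Logger: parts[1], Payload: parts[2]}, nil
}

// countFilter tallies messages by type and passes each one through unchanged.
type countFilter struct {
	counts map[string]int
}

func (f *countFilter) Process(msg *Message) []*Message {
	f.counts[msg.Type]++
	return []*Message{msg}
}

// memoryOutput collects formatted lines in memory instead of writing to a
// file or network service.
type memoryOutput struct {
	lines []string
}

func (o *memoryOutput) Write(msg *Message) error {
	o.lines = append(o.lines, msg.Logger+": "+msg.Payload)
	return nil
}

// runPipeline pushes each raw record through decoder, filter, and output.
func runPipeline(records []string, d Decoder, f Filter, o Output) error {
	for _, rec := range records {
		msg, err := d.Decode([]byte(rec))
		if err != nil {
			return err
		}
		for _, out := range f.Process(msg) {
			if err := o.Write(out); err != nil {
				return err
			}
		}
	}
	return nil
}

func main() {
	f := &countFilter{counts: map[string]int{}}
	o := &memoryOutput{}
	records := []string{"applog|marketplace|started", "counter|web|1"}
	if err := runPipeline(records, pipeDecoder{}, f, o); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(f.counts, o.lines)
}
```

Keeping each stage behind a small interface is what lets a deployment swap one decoder or output for another without touching the rest of the pipeline.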
Routing of messages through the Heka system is handled using "message matchers". Filters and outputs use a simple, high performance grammar to specify what messages they are interested in processing or handling.
Type == "counter" && Payload == "1"
Type == "applog" && Logger == "marketplace"
Type == "alert" && (Severity == 7 || Payload == "emergency")
Type == "myapp.metric" && Fields[name] == "successes"
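To make the semantics of these example matchers concrete, here is a minimal Go sketch that expresses two of them as composable predicate functions. The `Message` fields and the helper names (`And`, `Or`, `TypeIs`, and so on) are illustrative assumptions, not Heka's message struct or matcher implementation; Heka works from grammar strings like the ones above rather than hand-written predicates.

```go
package main

import "fmt"

// Message is a simplified, hypothetical message with only the fields that
// the example matchers refer to.
type Message struct {
	Type     string
	Logger   string
	Severity int
	Payload  string
	Fields   map[string]string
}

// Matcher reports whether a message should be routed to a plugin.
type Matcher func(m *Message) bool

// And matches when every sub-matcher matches.
func And(ms ...Matcher) Matcher {
	return func(m *Message) bool {
		for _, match := range ms {
			if !match(m) {
				return false
			}
		}
		return true
	}
}

// Or matches when at least one sub-matcher matches.
func Or(ms ...Matcher) Matcher {
	return func(m *Message) bool {
		for _, match := range ms {
			if match(m) {
				return true
			}
		}
		return false
	}
}

func TypeIs(t string) Matcher    { return func(m *Message) bool { return m.Type == t } }
func PayloadIs(p string) Matcher { return func(m *Message) bool { return m.Payload == p } }
func SeverityIs(s int) Matcher   { return func(m *Message) bool { return m.Severity == s } }

// FieldIs compares a named dynamic field; a missing field never equals a
// non-empty value.
func FieldIs(name, value string) Matcher {
	return func(m *Message) bool { return m.Fields[name] == value }
}

// Type == "alert" && (Severity == 7 || Payload == "emergency")
var alertMatcher = And(TypeIs("alert"), Or(SeverityIs(7), PayloadIs("emergency")))

// Type == "myapp.metric" && Fields[name] == "successes"
var metricMatcher = And(TypeIs("myapp.metric"), FieldIs("name", "successes"))

func main() {
	msg := &Message{Type: "alert", Severity: 3, Payload: "emergency"}
	fmt.Println(alertMatcher(msg), metricMatcher(msg))
}
```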
The matching grammar supports regular expression matching on message contents, including match group extraction, so filters and outputs can access the string data that caused the message routing match.
Type == "applog" && Payload =~ /^MARKER:/
Type == "applog" && Payload =~ /(?P<pl>Payload)/
But you don't pay a performance penalty for regex or group matching unless it's actually used.
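As an illustration of what match group extraction looks like in Go, the sketch below uses the standard `regexp` package, which supports the `(?P<name>...)` named-group syntax. This shows the general technique only and is not Heka's matcher code. The `namedGroups` helper and the `markerRe` pattern are hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// markerRe is an example pattern (an assumption for this sketch): it matches
// payloads starting with "MARKER:" and captures the remainder as group "pl".
// Compiling once at package level avoids recompiling on every message.
var markerRe = regexp.MustCompile(`^MARKER:(?P<pl>.*)$`)

// namedGroups returns the named capture groups for payload, and false when
// the pattern does not match at all.
func namedGroups(re *regexp.Regexp, payload string) (map[string]string, bool) {
	match := re.FindStringSubmatch(payload)
	if match == nil {
		return nil, false
	}
	groups := make(map[string]string)
	// SubexpNames is index-aligned with match; index 0 is the whole match
	// and unnamed groups have an empty name.
	for i, name := range re.SubexpNames() {
		if i > 0 && name != "" {
			groups[name] = match[i]
		}
	}
	return groups, true
}

func main() {
	groups, ok := namedGroups(markerRe, "MARKER:deploy finished")
	fmt.Println(ok, groups["pl"])
}
```

A plugin receiving the extracted groups can then act on the captured text directly instead of re-parsing the payload.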
Heka is written in Go, and all Heka plugins can be written in Go. Filter plugins, however, can also be written in a sandboxed scripting language, which allows dynamic loading and provides security and resource utilization protection. Heka currently provides a working Lua sandbox.
While the basic infrastructure for Heka is in place, the project is just getting started. Here are some features that are either on the road map or are under consideration: