LogHarbour for Developers

Remiges LogHarbour is a centralised log repository which captures log entries from one or more applications and stores them securely in a separate database. Applications may query this database, but its entries are protected against tampering and deletion. LogHarbour is open source and has client libraries in Go and Java. It can be downloaded from GitHub.

This blog carries technical articles which will help developers understand how to use LogHarbour and integrate it into the applications they develop.


What is Remiges LogHarbour?

Remiges LogHarbour is a highly scalable and secure open source product which allows applications to maintain all their logs in a central log repository. Applications insert entries into various logs, and the logs have a standard structure which allows data to be retrieved for forensic analysis or history traceback.

LogHarbour uses an Elasticsearch database as its data store. If multiple applications write to a single LogHarbour installation, each application gets a separate index in Elasticsearch. The minimum recommended LogHarbour installation is a three-node Elasticsearch cluster in which the data is sharded and each shard is replicated onto the other two nodes, giving higher capacity, higher throughput and better fault tolerance.
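As a rough illustration of the replication arithmetic, the sketch below builds the settings body of an Elasticsearch create-index request for such a cluster. The keys number_of_shards and number_of_replicas are standard Elasticsearch index settings; the helper name settingsForCluster and the choice of three primary shards are assumptions made for this example, not values taken from LogHarbour.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// indexSettings mirrors the "settings" object of an Elasticsearch
// create-index request body.
type indexSettings struct {
	NumberOfShards   int `json:"number_of_shards"`
	NumberOfReplicas int `json:"number_of_replicas"`
}

// createIndexBody is the request body for creating one index.
type createIndexBody struct {
	Settings indexSettings `json:"settings"`
}

// settingsForCluster (hypothetical helper) returns settings that keep a copy
// of every shard on every node: one primary plus (nodes - 1) replicas.
func settingsForCluster(nodes, shards int) createIndexBody {
	return createIndexBody{
		Settings: indexSettings{
			NumberOfShards:   shards,
			NumberOfReplicas: nodes - 1,
		},
	}
}

func main() {
	// Three nodes as recommended above; three primary shards is an assumption.
	body := settingsForCluster(3, 3)
	out, err := json.MarshalIndent(body, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With three nodes this yields two replicas per shard, so any single node holds a full copy of the index and the cluster can keep serving reads if a node is lost.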

LogHarbour client libraries are available in Go and Java. There are no plans to support additional languages. The client libraries publish log entries to a Kafka stream. Each client library instance acts as a Kafka producer, and a central service acts as the Kafka consumer, pulling messages off the stream and writing them to Elasticsearch.
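The shape of this pipeline can be sketched in Go without any Kafka dependency. In the sketch below a buffered channel stands in for the Kafka stream and an in-memory map stands in for Elasticsearch; it is not the LogHarbour client API, and every name in it (entry, store, runPipeline) is hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// entry is a hypothetical, heavily simplified log entry.
type entry struct {
	App     string
	Message string
}

// store stands in for Elasticsearch: one slice of entries per application,
// mirroring the one-index-per-application layout.
type store struct {
	mu      sync.Mutex
	indices map[string][]entry
}

func newStore() *store {
	return &store{indices: make(map[string][]entry)}
}

func (s *store) write(e entry) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.indices[e.App] = append(s.indices[e.App], e)
}

func (s *store) count(app string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.indices[app])
}

// runPipeline starts one producer goroutine per application, each publishing
// n entries to a shared stream, and a single consumer that drains the stream
// into the store. It returns once every entry has been written.
func runPipeline(apps []string, n int) *store {
	stream := make(chan entry, 16) // stands in for the Kafka stream
	st := newStore()

	var producers sync.WaitGroup
	for _, app := range apps {
		producers.Add(1)
		go func(app string) {
			defer producers.Done()
			for i := 0; i < n; i++ {
				stream <- entry{App: app, Message: fmt.Sprintf("event %d", i)}
			}
		}(app)
	}

	// The central consumer: the only component that writes to the store.
	done := make(chan struct{})
	go func() {
		defer close(done)
		for e := range stream {
			st.write(e)
		}
	}()

	producers.Wait()
	close(stream)
	<-done
	return st
}

func main() {
	st := runPipeline([]string{"billing", "crm"}, 3)
	fmt.Println("billing:", st.count("billing"), "crm:", st.count("crm"))
}
```

The point of the arrangement is that applications never talk to the data store directly: they only publish, and a single service owns all writes.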

LogHarbour supports three types of logs:

  • Debug log: used only by programmers and systems engineers for debugging problems with the application. This type of log entry is normally turned off in production.
  • Activity log: used by the application to log each event, whether a success or an error, that occurs in the system.
  • Data-change log: used by the application whenever there is any change to any field of any object. This is used to trace back the change history of any datum.
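The three log types, and the rule that debug entries are normally switched off in production, can be modelled roughly as follows. This is a minimal sketch; the names LogType, Logger, DebugEnabled and ShouldEmit are hypothetical and do not come from the LogHarbour client library.

```go
package main

import "fmt"

// LogType enumerates the three kinds of log described above.
type LogType int

const (
	Debug LogType = iota
	Activity
	DataChange
)

// String returns a readable name for the log type.
func (t LogType) String() string {
	switch t {
	case Debug:
		return "debug"
	case Activity:
		return "activity"
	case DataChange:
		return "data-change"
	default:
		return "unknown"
	}
}

// Logger holds the one switch this sketch cares about (hypothetical).
type Logger struct {
	DebugEnabled bool
}

// ShouldEmit reports whether an entry of type t would be written.
// Activity and data-change entries are always written; debug entries
// are written only when debugging has been enabled.
func (l Logger) ShouldEmit(t LogType) bool {
	if t == Debug {
		return l.DebugEnabled
	}
	return true
}

func main() {
	production := Logger{DebugEnabled: false}
	for _, t := range []LogType{Debug, Activity, DataChange} {
		fmt.Printf("%s: emit=%v\n", t, production.ShouldEmit(t))
	}
}
```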

Each entry in each log has a fixed structure: several standard fields with standard formats and semantics, plus a JSON payload which is interpreted according to context. The 20th-century practice of logging plain text lines into syslog imposes a heavy overhead on log processors, which must carry a lot of arcane folklore about the formats of these text lines in order to parse them, and this reduces the value of the log data itself. LogHarbour makes this process much easier.
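To make the idea of "standard fields plus a JSON payload" concrete, here is a minimal Go sketch. The field names (app, type, when, who, payload) and the FieldChange payload shape are assumptions invented for this example; they are not the actual LogHarbour entry schema. The sketch shows the standard fields being decoded once, while the payload is kept as raw JSON and decoded only after the entry type is known.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Entry holds the standard fields plus a context-specific payload.
// All field names here are hypothetical.
type Entry struct {
	App     string          `json:"app"`
	Type    string          `json:"type"`
	When    time.Time       `json:"when"`
	Who     string          `json:"who"`
	Payload json.RawMessage `json:"payload"` // left undecoded until Type is known
}

// FieldChange is a hypothetical payload for a data-change entry.
type FieldChange struct {
	Field    string `json:"field"`
	OldValue string `json:"old_value"`
	NewValue string `json:"new_value"`
}

// decodeChange parses the standard fields, checks that the entry is a
// data-change entry, and only then interprets the payload.
func decodeChange(raw []byte) (Entry, FieldChange, error) {
	var e Entry
	if err := json.Unmarshal(raw, &e); err != nil {
		return e, FieldChange{}, err
	}
	if e.Type != "data-change" {
		return e, FieldChange{}, fmt.Errorf("unexpected entry type %q", e.Type)
	}
	var c FieldChange
	if err := json.Unmarshal(e.Payload, &c); err != nil {
		return e, c, err
	}
	return e, c, nil
}

func main() {
	raw := []byte(`{
		"app": "crm",
		"type": "data-change",
		"when": "2024-01-15T10:30:00Z",
		"who": "alice",
		"payload": {"field": "email", "old_value": "a@example.com", "new_value": "b@example.com"}
	}`)

	e, c, err := decodeChange(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s changed %s from %q to %q at %s\n",
		e.Who, c.Field, c.OldValue, c.NewValue, e.When.Format(time.RFC3339))
}
```

Compared with a free-text syslog line, nothing here depends on regular expressions or on knowing how a particular application happens to phrase its messages: a log processor can filter on the standard fields alone and open the payload only when it needs to.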

LogHarbour is released under an Apache 2.0 licence.