Monitoring = (Elasticsearch + Logstash + Kibana) + Kafka * Flink

I used to be a regular business developer: Delphi, then Java. Recently, working somewhere where the project was not very interesting, I came to a conclusion: I would like to bring real business value here.

Then Elasticsearch came to my mind…

I had worked with it at my previous job, but here, look, no one was using ES. Docker was one of the milestones of this journey. With dozens of applications running on it, I cannot imagine the effort it would take to run each of these apps manually. Now? It is a piece of cake.

Using prebuilt Docker images, I was able to build a monitoring system that provides both historical and real-time analysis. Let me introduce its building blocks. I shall also provide some example docker-compose scripts and config files: undeniable proof of how easy the configuration is.

Files and resources are monitored…

Beats are listening. This is their task. With their help it is easy to monitor resources such as a server's state, files, or web applications. Name something, and they will monitor it. Everything I needed could easily be done using Beats such as Filebeat and Metricbeat. These small Go-powered apps carefully watch what is going on somewhere (the input) and loyally report it to the output. Beats can send data directly to Elasticsearch, the heart of the ELK system, or indirectly via Logstash. Configuring? In YAML it can be done this way:
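
As a minimal sketch of such a configuration: the log path, the input id and the Logstash host name below are placeholders assumed for illustration, not the original setup. Older Filebeat releases use `type: log` instead of `type: filestream`.

```yaml
# filebeat.yml - minimal sketch; path, id and host are assumed placeholders
filebeat.inputs:
  - type: filestream               # tail plain-text log files
    id: myapp-logs                 # hypothetical unique id for this input
    paths:
      - /var/log/myapp/*.log       # hypothetical application log location

# Ship events to Logstash for parsing.
# Filebeat accepts only one enabled output at a time.
output.logstash:
  hosts: ["logstash:5044"]         # assumed host name; 5044 is the conventional Beats port

# Alternative: skip Logstash and write directly to Elasticsearch.
# output.elasticsearch:
#   hosts: ["http://elasticsearch:9200"]
```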

 

… then parsed…

Messages coming from Filebeat can be parsed using grok expressions. These are named, reusable patterns built on top of the well-known regular expressions. Logstash takes the data in as raw text and produces structured documents, say JSON. The documents can then be enriched, e.g. with a city name based on GPS coordinates. Now we are ready to go: we may store them or send them further. Below is an example Logstash configuration.
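
A minimal sketch of such a pipeline could look like the following. It assumes log lines shaped like `2018-01-01T12:00:00 ERROR something failed`, an Elasticsearch host named `elasticsearch`, and a hypothetical index name; none of these come from the original setup. The enrichment step is left out.

```
# logstash.conf - minimal sketch; line format, host and index name are assumptions
input {
  beats {
    port => 5044                      # listen for events sent by Filebeat
  }
}

filter {
  grok {
    # Split the raw line into timestamp, level and free-text message fields
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}" }
  }
  date {
    # Use the parsed timestamp as the event time
    match => ["timestamp", "ISO8601"]
  }
}

output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]   # assumed host name
    index => "app-logs-%{+YYYY.MM.dd}"       # hypothetical daily index
  }
}
```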

This configuration assumes we want to listen for messages coming directly from Filebeat and pass them on to Elasticsearch.

… reaching persistence …

The persistence layer is Elasticsearch. It is not a database in the full sense of the word. Elasticsearch is a search engine with Lucene underneath. You can easily browse the data in near real time, querying it with all the riches of full-text search. Running distributed Elasticsearch in a Docker-ready environment can be as easy as this:
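
What follows is a sketch of a two-node cluster with Kibana for local experiments, not the original script. The image version, cluster name, heap size and service names are assumptions chosen for illustration. On Linux hosts Elasticsearch typically also needs the kernel setting `vm.max_map_count` raised to at least 262144.

```yaml
# docker-compose.yml - two-node sketch; version, names and heap size are assumptions
version: "2.2"
services:
  es01:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.9
    environment:
      - node.name=es01
      - cluster.name=monitoring                  # hypothetical cluster name
      - discovery.seed_hosts=es02                # where to look for the other node
      - cluster.initial_master_nodes=es01,es02   # used only for the first bootstrap
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"         # small heap for a local run
    volumes:
      - esdata01:/usr/share/elasticsearch/data
    ports:
      - "9200:9200"                              # REST API exposed on the host
  es02:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.9
    environment:
      - node.name=es02
      - cluster.name=monitoring
      - discovery.seed_hosts=es01
      - cluster.initial_master_nodes=es01,es02
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    volumes:
      - esdata02:/usr/share/elasticsearch/data
  kibana:
    image: docker.elastic.co/kibana/kibana:7.17.9
    environment:
      - ELASTICSEARCH_HOSTS=http://es01:9200     # point Kibana at the first node
    ports:
      - "5601:5601"
    depends_on:
      - es01
volumes:
  esdata01:
  esdata02:
```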

… and run docker-compose up once.

… browsing in Kibana!

Kibana is simply the GUI for all of this. If you are not a big fan of querying over REST with Postman or similar tools, just look at the data instead. We are humans. One of the Udacity courses I took had a great lesson on data visualization: how important it is to present meaningful information in a way that people can easily understand.

Real time? No problem

In parallel, the same data goes into a data stream. The stream is produced by the very same Filebeat, which pushes it to Kafka. A consumer, in this case a Flink application, then collects the messages and puts them into windows (time-based buckets), sending the results on to the alerting machines. Some logic can then be set up, e.g. alert me when N messages occur within M seconds.
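
The producing side of this stream can be sketched as a Filebeat Kafka output. The broker address, topic name, input id and log path below are assumptions for illustration. Filebeat accepts only one enabled output per instance, so this sketch assumes a second Filebeat instance, or a config that replaces the Logstash output, reading the same files.

```yaml
# filebeat.yml (streaming variant) - sketch; broker, topic, id and path are assumptions
filebeat.inputs:
  - type: filestream
    id: myapp-logs-stream          # hypothetical unique id for this input
    paths:
      - /var/log/myapp/*.log       # hypothetical application log location

output.kafka:
  hosts: ["kafka:9092"]            # assumed broker address
  topic: "app-logs"                # hypothetical topic consumed by the Flink job
  required_acks: 1                 # wait for the partition leader to acknowledge
  compression: gzip
```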

Summary

Building all of these blocks without prebuilt Docker images would not be so easy. In my opinion, monitoring systems will keep gaining popularity. Be watchful. And take care!