Category Archives: Logstash

Building a data pipeline from HDFS to Elasticsearch using Kafka and Logstash

Logstash has no input plugin for HDFS, as you can see here, so it cannot load data directly from HDFS into Elasticsearch. In a previous post I showed one way to work around this limitation by using Hive. This time we will see another way, using Kafka. I tried this process on two versions of Kafka: Apache Kafka… Read More »

Checking and fixing the Logstash configuration file

This is a short post that may save you some headaches. Sometimes you change the Logstash configuration file, and afterwards Logstash won’t start and you see errors like this in the log: {:timestamp=>"2016-08-31T13:10:40.251000+0300", :message=>"fetched an invalid config", :config=>"input {\n  kafka {\n\tconsumer_threads => 1\n\ttopic_id => "elastic"\n\tzk_connect => "192.168.56.101:2181"\n  }\n}\noutput {\nelasticsearch {\naction => "index"\nhosts =>… Read More »

Using Logstash to load data from a relational database into Elasticsearch

Logstash is an extremely versatile tool for loading data into Elasticsearch. It has many plugins that can interact with almost every kind of system. Last time I showed how to download and install Logstash, and how to load data from CSV files into Elasticsearch. This time we will see how to load data from a… Read More »

Using Logstash to load a CSV file into Elasticsearch

Logstash is a great tool, offered by the makers of Elasticsearch, for transferring data between Elasticsearch and various other sources and targets. It is built around plugins, which makes it very versatile: besides the official plugins, there are many third-party plugins that fill the gaps and cover almost every existing technology. You can find some information about available plugins… Read More »