Building a data pipeline from HDFS to Elasticsearch using Kafka and Logstash
Logstash has no input plugin for HDFS, as you can see here, so it cannot load data directly from HDFS into Elasticsearch. In a previous post I showed one way to work around this limitation, using Hive. This time we will look at another way, using Kafka. I tried this process on two versions of Kafka: Apache Kafka…