Visualizing Kafka with Cloudera Streams Messaging Manager (SMM)

By | May 21, 2020

Streams Messaging Manager (SMM) is not new to Hortonworks users, but since I was mostly using Cloudera, I never had the opportunity to use it.

Following the merger of Cloudera and Hortonworks in early 2019, many good products that were originally part of HDP finally made their way into the Cloudera platform, including SMM.

Cloudera’s data movement platform is called Cloudera DataFlow, or CDF. One of its components is streams management, which consists of two services: Streams Messaging Manager (SMM) and Streams Replication Manager (SRM). This post looks into SMM; SRM will be covered in another post.

With CDF, Cloudera adopts a more modular approach. Instead of bundling everything into one big CDH parcel, there are now many small additional parcels that can be downloaded separately and added to the CDH cluster if needed, using the CSD mechanism.

The downside of this new approach is that you do not have everything installed in advance as in CDH, and you have to go through the installation -> distribution -> activation process over and over again. The upside is that you can now upgrade individual components without upgrading the whole of CDH.

Long ago, I wrote a post introducing some free graphical tools for managing and monitoring Kafka. SMM is not free, but it looks much better and lets you control and monitor Kafka visually while being completely integrated with Cloudera Manager.

Installation

As mentioned earlier, each part of CDF is now a separate parcel, so you have to download both Schema Registry and Streams Messaging Manager. Each product has a parcel and a sha1 file along with a CSD (custom service descriptor) file.

The basic installation procedure for a CSD extension is as follows:

  • Copy the CSD file (it is actually a jar file) to /opt/cloudera/csd on your Cloudera Manager host.
  • Copy the parcel and the sha1 files to /opt/cloudera/parcel-repo on your Cloudera Manager host.
  • Restart Cloudera Manager and the Cloudera Management Service.
  • Go to parcel management in Cloudera Manager. The new parcel should appear there.
  • Distribute and activate it.
  • Go to the cluster where you want to install the new service and choose “Add Service”.
  • Choose the new service and fill in any required configuration.
  • Start the new service (if not already started).
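The first two steps of the list above can be sketched as a small shell function. This is only a sketch under assumptions: the file names in the usage comment are hypothetical placeholders (use the names of the files you actually downloaded), the function name is my own, and the restart line assumes a systemd host where the Cloudera Manager server runs as the cloudera-scm-server service.

```shell
# Sketch: stage a CSD jar and a parcel (plus its checksum file) for Cloudera Manager.
# The default target directories are the ones named in the steps above.
stage_csd_and_parcel() {
  local csd_jar="$1"     # path to the downloaded CSD jar
  local parcel="$2"      # path to the downloaded parcel file
  local checksum="$3"    # path to the matching sha1 file
  local csd_dir="${4:-/opt/cloudera/csd}"
  local repo_dir="${5:-/opt/cloudera/parcel-repo}"

  # Fail early if any input is missing, before touching the target directories.
  local f
  for f in "$csd_jar" "$parcel" "$checksum"; do
    if [ ! -f "$f" ]; then
      echo "missing file: $f" >&2
      return 1
    fi
  done

  mkdir -p "$csd_dir" "$repo_dir" || return 1
  cp "$csd_jar" "$csd_dir/" || return 1
  cp "$parcel" "$checksum" "$repo_dir/" || return 1
  echo "staged $(basename "$csd_jar") and $(basename "$parcel")"
}

# Example (as root on the Cloudera Manager host; file names are hypothetical):
#   stage_csd_and_parcel SMM.jar SMM.parcel SMM.parcel.sha1
#   systemctl restart cloudera-scm-server
```

The remaining steps (distribute, activate, add service) are done from the Cloudera Manager UI as described above.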

 

Apart from this generic process, SMM requires some extra steps. First, you should have Schema Registry already installed (follow the generic process shown above), and of course you should have a running Kafka cluster.

Then, install Node.js and the forever package on the node that will host SMM:

yum install nodejs

npm install forever -g
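Before adding the SMM service, it may be worth confirming that both installs actually landed on the PATH. A minimal sketch, assuming the Node.js package provides a node binary and that forever was installed globally; the function name is my own:

```shell
# Sketch: report which of the given commands are missing from PATH.
missing_commands() {
  local cmd
  local missing=""
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      missing="$missing $cmd"
    fi
  done
  # Print the missing names (leading space trimmed); succeed only when nothing is missing.
  echo "${missing# }"
  [ -z "$missing" ]
}

# Example on the SMM host:
#   missing_commands node npm forever || echo "install the commands listed above first"
```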

Obtain the name of the Kafka service (the default name, if you didn’t change it during installation, is “kafka”).

On one of Kafka’s charts, click the settings icon and choose “Open in Chart Builder”:

Then, under the query text on the upper left, you will see the Kafka service name:
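As an alternative to the chart builder, the Cloudera Manager REST API can list the services of a cluster, and the entry whose type is KAFKA carries the service name. The sketch below only builds the URL; the host, the credentials, the API version (v19 here) and the cluster name are assumptions that you should replace with your own values.

```shell
# Sketch: build the Cloudera Manager API URL that lists the services of a cluster.
cm_services_url() {
  local base="$1"      # e.g. http://cm-host.example.com:7180 (hypothetical host)
  local version="$2"   # API version such as v19; GET <base>/api/version reports the highest one supported
  local cluster="$3"   # cluster name as shown in Cloudera Manager
  # Cluster names often contain spaces ("Cluster 1"), which must be percent-encoded.
  local encoded
  encoded=$(printf '%s' "$cluster" | sed 's/ /%20/g')
  printf '%s/api/%s/clusters/%s/services\n' "$base" "$version" "$encoded"
}

# Example (credentials are placeholders):
#   curl -s -u admin:admin "$(cm_services_url http://cm-host.example.com:7180 v19 'Cluster 1')"
# In the JSON response, look for the item with "type" : "KAFKA"; its "name" is the service name.
```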

 

In the Kafka configuration, ensure that the Enable Producer Metrics check box is selected.

SMM puts some extra memory load on the Service Monitor and on Cloudera Manager, so keep an eye on the memory consumption of both and increase their memory allocation if necessary.

Using SMM

SMM offers a wealth of monitoring information, along with the ability to create and edit topics.

SMM consists of two processes: a web UI server and a REST admin server. The REST admin server offers a rich API that is documented with Swagger, so you can use the information in your own applications.

But what we are looking for is the web UI. It has many options to dig into, much more than we can cover in detail here. To investigate it further, I suggest installing it on a test system and experimenting.

The UI server main page shows a nice dashboard with different stats showing the topics and producers, how many bytes and messages are coming in and how many are going out of the topics and so on. The green theme shows this product’s origin:

Besides the overview page, the icon menu on the left also offers a brokers page, a topics page, producers and consumers pages, and an alerts page. Most of the objects can be clicked on to see more details.

The brokers page shows all the Kafka brokers with some stats like throughput and number of partitions. If you click on a broker, you can see all the partitions hosted on it:

Next in the menu is the topics page showing how much data goes through each topic.

Clicking on the green “Add new” button lets you easily create topics with all the available customization options:
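For comparison, this is roughly what the same operation looks like with the command line scripts that SMM spares you from. The sketch only prints a kafka-topics command rather than running it; the broker address, topic name and config values in the usage comment are hypothetical, and it assumes a Kafka version whose kafka-topics tool accepts --bootstrap-server.

```shell
# Sketch: print a kafka-topics command that creates a topic with optional configs.
create_topic_cmd() {
  local bootstrap="$1"
  local topic="$2"
  local partitions="$3"
  local replication="$4"
  shift 4
  local cmd="kafka-topics --create --bootstrap-server $bootstrap --topic $topic --partitions $partitions --replication-factor $replication"
  # Any remaining arguments are key=value topic configs, e.g. retention.ms=86400000.
  local cfg
  for cfg in "$@"; do
    cmd="$cmd --config $cfg"
  done
  echo "$cmd"
}

# Example (hypothetical broker and topic):
#   create_topic_cmd broker1:9092 events 6 3 retention.ms=86400000
```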

 

 

If you click on the down arrow (circled in blue) you can view the topic’s partitions as shown below:

If you click the profile icon (circled in red), you get to a very interesting display that has four tabs where you can see topic metrics, configuration and even the messages themselves (the image below shows the data explorer that can show messages between specific offsets):
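To get a feel for what the data explorer saves you, here is a sketch of the command-line counterpart: reading a range of offsets from one partition with kafka-console-consumer. The function only prints the command; the broker, topic and offsets are hypothetical, and the message count assumes there are no gaps in the offsets of the range (compacted topics, for example, can have gaps).

```shell
# Sketch: print a kafka-console-consumer command that reads offsets from..to of one partition.
consume_range_cmd() {
  local bootstrap="$1"
  local topic="$2"
  local partition="$3"
  local from="$4"
  local to="$5"
  if [ "$to" -lt "$from" ]; then
    echo "end offset must not be smaller than start offset" >&2
    return 1
  fi
  # Both ends are inclusive, so the number of messages is to - from + 1.
  local count=$(( $to - $from + 1 ))
  echo "kafka-console-consumer --bootstrap-server $bootstrap --topic $topic" \
       "--partition $partition --offset $from --max-messages $count"
}

# Example (hypothetical broker and topic):
#   consume_range_cmd broker1:9092 events 0 100 150
```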

Another very welcome feature is the ability to define alerts for certain conditions and send notifications as soon as an alert is fired.

On the alerts page, you can add an alert policy in which you define the condition under which the alert fires. You select the attribute you want to monitor from a preset list that depends on the component being monitored (topic, consumer, etc.). Once the policies are set, an alert is raised whenever the condition is met:

You can also define notifiers and connect them to alerts. This way you can receive email alerts whenever something goes wrong.

Email is the only notification method supported right now. It would be nice to have some more options there (such as HTTP).

Conclusion

Streams Messaging Manager is a very good Kafka management and monitoring tool. It looks better and feels more solid and mature than most of the free solutions I have tried.

It looks simple, but it gives you everything you need to manage and monitor your Kafka cluster without ever having to work with the command line scripts. Having it in Cloudera Manager lets you control everything from one place. This is my Kafka tool of choice until further notice.
