Kafka with Microsoft Azure: Streaming Unlimited Data Into Cloud | HCLTech

October 11, 2022

Every organization uses heterogeneous data sources that come in diverse formats. So, when we plan for data integration, we combine data from these multiple sources into a single, unified repository. However, because of their heavy interdependence and complex connectivity, these integrations can pose a real challenge.

Another complex issue is making Apache Kafka available in an Azure environment, because several deployment scenarios have to be accounted for.

Apache Kafka with Azure promises to help decouple these disparate data streams and systems while also providing the required Microsoft cloud connectivity, which addresses both challenges.

As you go through this blog, you will see how the combination of Kafka and Azure can simplify even the heaviest data workloads, which opens up many possibilities for data-intensive applications.

Common scenarios where Apache Kafka is useful

Organizations increasingly rely on Kafka as a safe bet when integrating a diverse portfolio of applications, and its uses are wide-ranging. Its importance can also be gauged from the fact that it has been rated 4.4 out of 5 by Gartner. Some of the common uses of Apache Kafka are as follows:

  • Integration of multi-cloud architecture
  • Integration with big data technologies such as Spark and Hadoop
  • Gathering real-time metrics from many different places and locations, for example, IoT devices
  • Diversified messaging systems
  • Real-time activity tracking
  • Application logs analysis
  • De-coupling of system dependencies
  • As an event-sourcing store

HCLTech customer success story: Kafka Microsoft Azure

A US-based, global clothing and accessories retailer

HCLTech recently re-architected a complicated data integration landscape using Kafka with Azure for a well-known American clothing and accessories retailer that operates worldwide. The existing system was very complex because it relied on various on-premises data marts for order management, inventory management, shipments, and package delivery.

The previous system did not provide real-time reporting because these data marts were refreshed only a few times a day. There was also no integration between the systems involved and no real-time or near real-time visibility into the various processes. The legacy system was therefore re-architected using Kafka with Azure Event Hubs to provide real-time visibility into order fulfillment. It also allows customers to explore overall online demand, fulfillment, and revenue in real time. Another reason for integrating Kafka with Event Hubs was to gain access to the Azure ecosystem for migrating data into the Azure cloud. We also got easy connectivity to other Azure components such as Azure SQL Server, Stream Analytics, and Azure Analysis Services.

Solution

Why Kafka on Azure: Various scenarios

  • Although Kafka has a rich ecosystem, running it on our own has inherent challenges. We must deploy it ourselves, which means procuring the required hardware, provisioning the physical network and security, installing the OS, and then installing and configuring Kafka. It does not end with deployment; we also need to manage it manually. When we connect it to Azure Event Hubs, we can provision clusters, perform load balancing, and leverage the cloud’s robust security model with a single click of a button, making it fully managed.

Easy to use and scale

  • With Event Hubs, Kafka can be provisioned without writing any code, which makes it fully managed and easy to scale. So, when the load on Kafka grows, the elastic scaling offered by this integration is very helpful.

Migration to the cloud

  • In the case of a migration scenario, we can easily migrate data into Azure and integrate it with other services such as Service Bus, Functions, etc. using Event Hubs.

Gateway to the cloud

  • We can access all the features of Kafka and Azure ecosystem in combination with Event Hubs. Azure cloud integration will add more features such as elasticity, scalability, and security.

Kafka and Azure connectivity

In the last two years, Microsoft has gradually introduced a diverse set of services that allow us to easily leverage Kafka on the Azure cloud platform. Here is the list of sink connectors to Azure that allow us to directly integrate these high-end Azure services with Kafka.

Azure functions Kafka trigger support

Although Microsoft released different connectors last year, direct connectivity to Microsoft’s serverless offering, Azure Functions, was still missing. So in May 2022, Microsoft released the Kafka extension for Azure Functions, which makes it easy to detect and react to messages streaming into Kafka topics in real time, or to write to a Kafka topic through an output binding. We can now concentrate on our function’s core logic without worrying about the event-sourcing pipeline or maintaining the infrastructure required to host the extension. Please note that this feature is supported exclusively in the Premium plan (functions-bindings-Kafka-trigger).

The Premium plan allows the function to scale elastically as Kafka messages trigger it.
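As a minimal sketch of what the core logic of such a Kafka-triggered function might look like in Python, the handler below decodes one message and logs an order update. Everything specific here is an assumption for illustration: the `order_id` and `status` fields are hypothetical, the event object is only assumed to expose a `get_body()` method returning bytes (as `azure.functions.KafkaEvent` does), and the unwrapping of a `Value` envelope is a defensive guess about how the message may be delivered rather than documented behavior. The trigger binding itself (broker list, topic, consumer group) would be declared separately in the function's configuration.

```python
import json
import logging


def parse_order_event(body: bytes) -> dict:
    """Decode a Kafka message body that carries a JSON-encoded order update."""
    payload = json.loads(body.decode("utf-8"))
    # Assumption: the message may arrive wrapped in an envelope whose
    # "Value" field holds the original message as a string. Unwrap it if so.
    if isinstance(payload, dict) and isinstance(payload.get("Value"), str):
        payload = json.loads(payload["Value"])
    # Hypothetical schema: every order update has an id and a status.
    return {"order_id": payload["order_id"], "status": payload["status"]}


def main(kevent) -> None:
    """Entry point; kevent is assumed to behave like azure.functions.KafkaEvent."""
    order = parse_order_event(kevent.get_body())
    logging.info("Order %s is now %s", order["order_id"], order["status"])
```

Keeping the parsing in a plain function, separate from the entry point, means the business logic can be tested without the Functions runtime or a Kafka broker.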

Event Hubs’ support architecture for Kafka

Azure Event Hubs has a lot in common with Apache Kafka: both are big data streaming platforms and event ingestion services. Azure Event Hubs is one of the premium Azure PaaS services and has little configuration or management overhead. An architecture-wise comparison (in the table below) shows how closely their concepts map to each other.

Kafka             Event Hubs
Cluster           Namespace
Topic             Event Hub
Partition         Partition
Consumer Group    Consumer Group
Offset            Offset

Kafka endpoint for Event Hubs

One of the endpoints of an Event Hubs namespace is compatible with the Apache Kafka producer and consumer APIs, version 1.0 and above. So we can use the Event Hubs Kafka endpoint from our existing applications with configuration changes only; no code changes or code deployment are required.
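The configuration change can be sketched as follows. This is an illustrative helper, not an official API: the function name `event_hubs_kafka_config`, the namespace `my-namespace`, and the placeholder connection string are assumptions. The settings themselves use librdkafka-style keys and reflect how the Event Hubs Kafka endpoint is typically addressed: port 9093 on the namespace host, SASL over TLS with the PLAIN mechanism, the literal user name `$ConnectionString`, and the namespace connection string as the password.

```python
def event_hubs_kafka_config(namespace: str, connection_string: str) -> dict:
    """Build Kafka client settings that point at an Event Hubs namespace."""
    return {
        # The Kafka endpoint of the namespace listens on port 9093.
        "bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        # Traffic is encrypted with TLS and authenticated with SASL PLAIN.
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        # The user name is this literal string, not an account name.
        "sasl.username": "$ConnectionString",
        # The password is the connection string of the namespace.
        "sasl.password": connection_string,
    }


# Hypothetical values for illustration only.
config = event_hubs_kafka_config("my-namespace", "<your-connection-string>")
for key, value in config.items():
    shown = "***" if key == "sasl.password" else value
    print(f"{key}={shown}")
```

A dictionary like this could be handed to an existing Kafka client, for example a `confluent_kafka.Producer`, in place of its current broker settings. Following the mapping in the table above, the Kafka topic name the application uses corresponds to the event hub name.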

Kafka on Kubernetes

Running Kafka on Azure Kubernetes Service (AKS) is very appealing for various reasons. If we are planning to standardize on Kubernetes as an enterprise container platform, that alone is a great reason to contemplate running Kafka there as well.

Running Kafka on Kubernetes enables us to easily streamline operations such as updates, restarts, and monitoring that are more or less integrated into the Kubernetes platform. There are only a few additional steps to be followed for doing that.

Conclusion

Apache Kafka with Azure is widely recognized across industries and verticals. In a short time, it has become the industry standard for data streaming, building real-time big data pipelines, and even asynchronous communication between microservices. Here, I have attempted to summarize the features provided by Kafka with Azure (Event Hubs), its connectivity with various Azure services and components, and how best to use it to process billions of events every day.

References

Please refer to the following URL for further information:

Azure Event Hubs - Process Apache Kafka events
