Apache Kafka originated at LinkedIn, became an open-source Apache project in 2011, and then became a first-class Apache project in 2012. Kafka is written in Scala and Java. It is a publish-subscribe based, fault-tolerant messaging system that is fast, scalable, and distributed by design.
This book explores the principles of Kafka, its installation, and its operations, and then walks you through the deployment of a Kafka cluster. Finally, it concludes with real-time applications and integration with Big Data technologies.
Big Data involves enormous volumes of data, which poses two main challenges. The first is how to collect a large volume of data, and the second is how to analyze the collected data. To overcome these challenges, you need a messaging system.
Kafka is designed for distributed, high-throughput systems, and it tends to work very well as a replacement for a more traditional message broker. Compared with other messaging systems, Kafka offers better throughput, built-in partitioning, replication, and inherent fault tolerance, which makes it a good fit for large-scale message processing applications.
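To make the idea of built-in partitioning in a publish-subscribe system more concrete, here is a minimal in-memory sketch. It is a toy model, not Kafka's implementation or client API: the `ToyTopic` class, its method names, and the use of `zlib.crc32` as the key hash are all assumptions made for illustration. It only shows how hashing a record key can route related records to the same partition, and how a subscriber reads from a partition starting at an offset.

```python
import zlib


class ToyTopic:
    """In-memory stand-in for a partitioned topic (hypothetical, not Kafka's code)."""

    def __init__(self, num_partitions):
        # Each partition is an append-only list of (key, value) records.
        self.partitions = [[] for _ in range(num_partitions)]

    def publish(self, key, value):
        # Hash the key to pick a partition, so equal keys share a partition.
        # crc32 is an assumption for this sketch, not Kafka's hash function.
        index = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[index].append((key, value))
        return index

    def read(self, partition, offset):
        # A subscriber tracks its own offset and reads every record from it onward.
        return self.partitions[partition][offset:]


topic = ToyTopic(num_partitions=3)
p1 = topic.publish("user-42", "login")
p2 = topic.publish("user-42", "logout")
records = topic.read(p1, 0)
```

Because both records carry the same key, they land in the same partition and are read back in the order they were published. Spreading different keys across partitions is what lets the load be divided among several machines in a real deployment.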
This book has been prepared for professionals aspiring to make a career in Big Data analytics using the Apache Kafka messaging system. It will give you enough understanding of how to use Kafka clusters.
Before proceeding with this book, you should have a good understanding of Java, Scala, distributed messaging systems, and the Linux environment.