Kafka Streams in Action

Kafka Streams in Action
Author: Bill Bejeck
Publisher: Simon and Schuster
Total Pages: 410
Release: 2018-08-29
Genre: Computers
ISBN: 1638356025

Summary Kafka Streams in Action teaches you everything you need to know to implement stream processing on data flowing into your Kafka platform, allowing you to focus on getting more from your data without sacrificing time or effort. Foreword by Neha Narkhede, Cocreator of Apache Kafka Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications. About the Technology Not all stream-based applications require a dedicated processing cluster. The lightweight Kafka Streams library provides exactly the power and simplicity you need for message handling in microservices and real-time event processing. With the Kafka Streams API, you filter and transform data streams with just Kafka and your application. About the Book Kafka Streams in Action teaches you to implement stream processing within the Kafka platform. In this easy-to-follow book, you'll explore real-world examples to collect, transform, and aggregate data, work with multiple processors, and handle real-time events. You'll even dive into streaming SQL with KSQL! Practical to the very end, it finishes with testing and operational aspects, such as monitoring and debugging. What's inside Using the KStreams API Filtering, transforming, and splitting data Working with the Processor API Integrating with external systems About the Reader Assumes some experience with distributed systems. No knowledge of Kafka or streaming applications required. About the Author Bill Bejeck is a Kafka Streams contributor and Confluent engineer with over 15 years of software development experience. Table of Contents PART 1 - GETTING STARTED WITH KAFKA STREAMS Welcome to Kafka Streams Kafka quicklyPART 2 - KAFKA STREAMS DEVELOPMENT Developing Kafka Streams Streams and state The KTable API The Processor APIPART 3 - ADMINISTERING KAFKA STREAMS Monitoring and performance Testing a Kafka Streams applicationPART 4 - ADVANCED CONCEPTS WITH KAFKA STREAMS Advanced applications with Kafka StreamsAPPENDIXES Appendix A - Additional configuration information Appendix B - Exactly once semantics

Kafka in Action

Kafka in Action
Author: Dylan Scott
Publisher: Simon and Schuster
Total Pages: 270
Release: 2022-03-22
Genre: Computers
ISBN: 163835619X

Master the wicked-fast Apache Kafka streaming platform through hands-on examples and real-world projects. In Kafka in Action you will learn: Understanding Apache Kafka concepts Setting up and executing basic ETL tasks using Kafka Connect Using Kafka as part of a large data project team Performing administrative tasks Producing and consuming event streams Working with Kafka from Java applications Implementing Kafka as a message queue Kafka in Action is a fast-paced introduction to every aspect of working with Apache Kafka. Starting with an overview of Kafka's core concepts, you'll immediately learn how to set up and execute basic data movement tasks and how to produce and consume streams of events. Advancing quickly, you’ll soon be ready to use Kafka in your day-to-day workflow, and start digging into even more advanced Kafka topics. About the technology Think of Apache Kafka as a high performance software bus that facilitates event streaming, logging, analytics, and other data pipeline tasks. With Kafka, you can easily build features like operational data monitoring and large-scale event processing into both large and small-scale applications. About the book Kafka in Action introduces the core features of Kafka, along with relevant examples of how to use it in real applications. In it, you’ll explore the most common use cases such as logging and managing streaming data. When you’re done, you’ll be ready to handle both basic developer- and admin-based tasks in a Kafka-focused team. What's inside Kafka as an event streaming platform Kafka producers and consumers from Java applications Kafka as part of a large data project About the reader For intermediate Java developers or data engineers. No prior knowledge of Kafka required. About the author Dylan Scott is a software developer in the insurance industry. Viktor Gamov is a Kafka-focused developer advocate. At Confluent, Dave Klein helps developers, teams, and enterprises harness the power of event streaming with Apache Kafka. Table of Contents PART 1 GETTING STARTED 1 Introduction to Kafka 2 Getting to know Kafka PART 2 APPLYING KAFK 3 Designing a Kafka project 4 Producers: Sourcing data 5 Consumers: Unlocking data 6 Brokers 7 Topics and partitions 8 Kafka storage 9 Management: Tools and logging PART 3 GOING FURTHER 10 Protecting Kafka 11 Schema registry 12 Stream processing with Kafka Streams and ksqlDB

Event Streams in Action

Event Streams in Action
Author: Valentin Crettaz
Publisher: Simon and Schuster
Total Pages: 485
Release: 2019-05-10
Genre: Computers
ISBN: 1638355835

Summary Event Streams in Action is a foundational book introducing the ULP paradigm and presenting techniques to use it effectively in data-rich environments. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications. About the Technology Many high-profile applications, like LinkedIn and Netflix, deliver nimble, responsive performance by reacting to user and system events as they occur. In large-scale systems, this requires efficiently monitoring, managing, and reacting to multiple event streams. Tools like Kafka, along with innovative patterns like unified log processing, help create a coherent data processing architecture for event-based applications. About the Book Event Streams in Action teaches you techniques for aggregating, storing, and processing event streams using the unified log processing pattern. In this hands-on guide, you'll discover important application designs like the lambda architecture, stream aggregation, and event reprocessing. You'll also explore scaling, resiliency, advanced stream patterns, and much more! By the time you're finished, you'll be designing large-scale data-driven applications that are easier to build, deploy, and maintain. What's inside Validating and monitoring event streams Event analytics Methods for event modeling Examples using Apache Kafka and Amazon Kinesis About the Reader For readers with experience coding in Java, Scala, or Python. About the Author Alexander Dean developed Snowplow, an open source event processing and analytics platform. Valentin Crettaz is an independent IT consultant with 25 years of experience. Table of Contents PART 1 - EVENT STREAMS AND UNIFIED LOGS Introducing event streams The unified log 24 Event stream processing with Apache Kafka Event stream processing with Amazon Kinesis Stateful stream processing PART 2- DATA ENGINEERING WITH STREAMS Schemas Archiving events Railway-oriented processing Commands PART 3 - EVENT ANALYTICS Analytics-on-read Analytics-on-write

Mastering Kafka Streams and ksqlDB

Mastering Kafka Streams and ksqlDB
Author: Mitch Seymour
Publisher: "O'Reilly Media, Inc."
Total Pages: 505
Release: 2021-02-04
Genre: Computers
ISBN: 1492062448

Working with unbounded and fast-moving data streams has historically been difficult. But with Kafka Streams and ksqlDB, building stream processing applications is easy and fun. This practical guide shows data engineers how to use these tools to build highly scalable stream processing applications for moving, enriching, and transforming large amounts of data in real time. Mitch Seymour, data services engineer at Mailchimp, explains important stream processing concepts against a backdrop of several interesting business problems. You'll learn the strengths of both Kafka Streams and ksqlDB to help you choose the best tool for each unique stream processing project. Non-Java developers will find the ksqlDB path to be an especially gentle introduction to stream processing. Learn the basics of Kafka and the pub/sub communication pattern Build stateless and stateful stream processing applications using Kafka Streams and ksqlDB Perform advanced stateful operations, including windowed joins and aggregations Understand how stateful processing works under the hood Learn about ksqlDB's data integration features, powered by Kafka Connect Work with different types of collections in ksqlDB and perform push and pull queries Deploy your Kafka Streams and ksqlDB applications to production

Kafka: The Definitive Guide

Kafka: The Definitive Guide
Author: Neha Narkhede
Publisher: "O'Reilly Media, Inc."
Total Pages: 315
Release: 2017-08-31
Genre: Computers
ISBN: 1491936118

Every enterprise application creates data, whether it’s log messages, metrics, user activity, outgoing messages, or something else. And how to move all of this data becomes nearly as important as the data itself. If you’re an application architect, developer, or production engineer new to Apache Kafka, this practical guide shows you how to use this open source streaming platform to handle real-time data feeds. Engineers from Confluent and LinkedIn who are responsible for developing Kafka explain how to deploy production Kafka clusters, write reliable event-driven microservices, and build scalable stream-processing applications with this platform. Through detailed examples, you’ll learn Kafka’s design principles, reliability guarantees, key APIs, and architecture details, including the replication protocol, the controller, and the storage layer. Understand publish-subscribe messaging and how it fits in the big data ecosystem. Explore Kafka producers and consumers for writing and reading messages Understand Kafka patterns and use-case requirements to ensure reliable data delivery Get best practices for building data pipelines and applications with Kafka Manage Kafka in production, and learn to perform monitoring, tuning, and maintenance tasks Learn the most critical metrics among Kafka’s operational measurements Explore how Kafka’s stream delivery capabilities make it a perfect source for stream processing systems

Kafka Streams - Real-time Stream Processing

Kafka Streams - Real-time Stream Processing
Author: Prashant Kumar Pandey
Publisher: Learning Journal
Total Pages: 350
Release: 2019-03-26
Genre: Computers
ISBN: 9353517257

The book Kafka Streams - Real-time Stream Processing helps you understand the stream processing in general and apply that skill to Kafka streams programming. This book is focusing mainly on the new generation of the Kafka Streams library available in the Apache Kafka 2.x. The primary focus of this book is on Kafka Streams. However, the book also touches on the other Apache Kafka capabilities and concepts that are necessary to grasp the Kafka Streams programming. Who should read this book? Kafka Streams: Real-time Stream Processing is written for software engineers willing to develop a stream processing application using Kafka Streams library. I am also writing this book for data architects and data engineers who are responsible for designing and building the organization’s data-centric infrastructure. Another group of people is the managers and architects who do not directly work with Kafka implementation, but they work with the people who implement Kafka Streams at the ground level. What should you already know? This book assumes that the reader is familiar with the basics of Java programming language. The source code and examples in this book are using Java 8, and I will be using Java 8 lambda syntax, so experience with lambda will be helpful. Kafka Streams is a library that runs on Kafka. Having a good fundamental knowledge of Kafka is essential to get the most out of Kafka Streams. I will touch base on the mandatory Kafka concepts for those who are new to Kafka. The book also assumes that you have some familiarity and experience in running and working on the Linux operating system.

Effective Kafka

Effective Kafka
Author: Emil Koutanov
Publisher:
Total Pages: 466
Release: 2020-03-17
Genre:
ISBN:

The software architecture landscape has evolved dramatically over the past decade. Microservices have displaced monoliths. Data and applications are increasingly becoming distributed and decentralised. But composing disparate systems is a hard problem. More recently, software practitioners have been rapidly converging on event-driven architecture as a sustainable way of dealing with complexity - integrating systems without increasing their coupling.In Effective Kafka, Emil Koutanov explores the fundamentals of Event-Driven Architecture - using Apache Kafka - the world's most popular and supported open-source event streaming platform.You'll learn: - The fundamentals of event-driven architecture and event streaming platforms- The background and rationale behind Apache Kafka, its numerous potential uses and applications- The architecture and core concepts - the underlying software components, partitioning and parallelism, load-balancing, record ordering and consistency modes- Installation of Kafka and related tooling - using standalone deployments, clusters, and containerised deployments with Docker- Using CLI tools to interact with and administer Kafka classes, as well as publishing data and browsing topics- Using third-party web-based tools for monitoring a cluster and gaining insights into the event streams- Building stream processing applications in Java 11 using off-the-shelf client libraries- Patterns and best-practice for organising the application architecture, with emphasis on maintainability and testability of the resulting code- The numerous gotchas that lurk in Kafka's client and broker configuration, and how to counter them- Theoretical background on distributed and concurrent computing, exploring factors affecting their liveness and safety- Best-practices for running multi-tenanted clusters across diverse engineering teams, how teams collaborate to build complex systems at scale and equitably share the cluster with the aid of quotas- Operational aspects of running Kafka clusters at scale, performance tuning and methods for optimising network and storage utilisation- All aspects of Kafka security -including network segregation, encryption, certificates, authentication and authorization.The coverage is progressively delivered and carefully aimed at giving you a journey-like experience into becoming proficient with Apache Kafka and Event-Driven Architecture. The goal is to get you designing and building applications. And by the conclusion of this book, you will be a confident practitioner and a Kafka evangelist within your organisation - wielding the knowledge necessary to teach others.

Apache Pulsar in Action

Apache Pulsar in Action
Author: David Kjerrumgaard
Publisher: Simon and Schuster
Total Pages: 398
Release: 2021-12-14
Genre: Computers
ISBN: 1617296880

Distributed applications demand reliable, high-performance messaging. The Apache Pulsar server-to-server messaging system provides a secure, stable platform without the need for a stream processing engine like Spark. Contributed by Yahoo to the Apache Foundation, Pulsar is mature and battle-tested, handling millions of messages per second for over three years at Yahoo. Apache Pulsar in Action is a comprehensive and practical guide to building high-traffic applications with Pulsar, delivering extreme levels of speed and durability. about the technology Pulsar is a streaming messaging system designed for high performance server-to-server messaging. Built and tested under intense conditions at Yahoo, Pulsar has been proven in production and can handle millions of messages per second. Now free and open-source, Pulsar''s unique architecture helps solve some of the challenges of modern development. Pulsar avoids latency in streaming data transmission, making it a powerful tool for IoT Edge analytics. Its unified messaging model improves the performance of microservices architecture, and its tiered storage capabilities allow for larger volumes of data to be handled without fear of data loss. Pulsar''s flexible API interface works with Java, C++, Python, and Go, making it easy to incorporate Pulsar into your stack. about the book Apache Pulsar in Action is a hands-on guide to building scalable streaming messaging systems for distributed applications and microservices systems. You''ll start with Pulsar''s fundamentals, each illustrated by real-world examples, as you get to grips with Pulsar''s unique architecture. Pulsar contributor David Kjerrumgaard teaches the skills you need to deploy a Pulsar server, ingest data from third-party systems, and deploy lightweight computing logic with simple functions. You''ll learn to employ Pulsar''s seamless scalability through relatable case studies, including an IOT analytics application that can be deployed within a resource constrained environment and a microservices application based on Pulsar functions. At the end of this practical book, you''ll be ready to fully take advantage of Pulsar to create high-traffic message-driven applications. what''s inside Publish from Apache Pulsar into third-party data repositories and platforms Design and develop Apache Pulsar functions Perform interactive SQL queries against data stored in Apache Pulsar Examples of Pulsar-based microservices that you can download and try yourself about the reader Written for experienced Java developers. No prior knowledge of Pulsar is needed. about the author David Kjerrumgaard is the Director of Solution Architecture at Streamlio, and a contributor to the Apache Pulsar and Apache NiFi projects.

Reactive Streams in Java

Reactive Streams in Java
Author: Adam L. Davis
Publisher: Apress
Total Pages: 146
Release: 2018-11-29
Genre: Computers
ISBN: 1484241762

Get an easy introduction to reactive streams in Java to handle concurrency, data streams, and the propagation of change in today's applications. This compact book includes in-depth introductions to RxJava, Akka Streams, and Reactor, and integrates the latest related features from Java 9 and 11, as well as reactive streams programming with the Android SDK. Reactive Streams in Java explains how to manage the exchange of stream data across an asynchronous boundary—passing elements on to another thread or thread-pool—while ensuring that the receiving side is not forced to buffer arbitrary amounts of data which can reduce application efficiency. After reading and using this book, you'll be proficient in programming reactive streams for Java in order to optimize application performance, and improve memory management and data exchanges. What You Will Learn Discover reactive streams and how to use them Work with the latest features in Java 9 and Java 11Apply reactive streams using RxJava Program using Akka StreamsCarry out reactive streams programming in Android Who This Book Is For Experienced Java programmers.

Real-Time Streaming with Apache Kafka, Spark, and Storm

Real-Time Streaming with Apache Kafka, Spark, and Storm
Author: Brindha Priyadarshini Jeyaraman
Publisher: BPB Publications
Total Pages: 196
Release: 2021-08-20
Genre: Computers
ISBN: 9390684595

Build a platform using Apache Kafka, Spark, and Storm to generate real-time data insights and view them through Dashboards. KEY FEATURES ● Extensive practical demonstration of Apache Kafka concepts, including producer and consumer examples. ● Includes graphical examples and explanations of implementing Kafka Producer and Kafka Consumer commands and methods. ● Covers integration and implementation of Spark-Kafka and Kafka-Storm architectures. DESCRIPTION Real-Time Streaming with Apache Kafka, Spark, and Storm is a book that provides an overview of the real-time streaming concepts and architectures of Apache Kafka, Storm, and Spark. The readers will learn how to build systems that can process data streams in real time using these technologies. They will be able to process a large amount of real-time data and perform analytics or generate insights as a result of this. The architecture of Kafka and its various components are described in detail. A Kafka Cluster installation and configuration will be demonstrated. The Kafka publisher-subscriber system will be implemented in the Eclipse IDE using the Command Line and Java. The book discusses the architecture of Apache Storm, the concepts of Spout and Bolt, as well as their applications in a Transaction Alert System. It also describes Spark's core concepts, applications, and the use of Spark to implement a microservice. To learn about the process of integrating Kafka and Storm, two approaches to Spark and Kafka integration will be discussed. This book will assist a software engineer to transition to a Big Data engineer and Big Data architect by providing knowledge of big data processing and the architectures of Kafka, Storm, and Spark Streaming. WHAT YOU WILL LEARN ● Creation of Kafka producers, consumers, and brokers using command line. ● End-to-end implementation of Kafka messaging system with Java in Eclipse. ● Perform installation and creation of a Storm Cluster and execute Storm Management commands. ● Implement Spouts, Bolts and a Topology in Storm for Transaction alert application system. ● Perform the implementation of a microservice using Spark in Scala IDE. ● Learn about the various approaches of integrating Kafka and Spark. ● Perform integration of Kafka and Storm using Java in the Eclipse IDE. WHO THIS BOOK IS FOR This book is intended for Software Developers, Data Scientists, and Big Data Architects who want to build software systems to process data streams in real time. To understand the concepts in this book, knowledge of any programming language such as Java, Python, etc. is needed. TABLE OF CONTENTS 1. Introduction to Kafka 2. Installing Kafka 3. Kafka Messaging 4. Kafka Producers 5. Kafka Consumers 6. Introduction to Storm 7. Installation and Configuration 8. Spouts and Bolts 9. Introduction to Spark 10. Spark Streaming 11. Kafka Integration with Storm 12. Kafka Integration with Spark