A distributed event bus that implements a RESTful API abstraction on top of Kafka-like queues

Overview

Nakadi Event Broker

Nakadi is a distributed event bus broker that implements a RESTful API abstraction on top of Kafka-like queues, which can be used to send, receive, and analyze streaming data in real time, in a reliable and highly available manner.

One of the most prominent use cases of Nakadi is to decouple micro-services by building data streams between producers and consumers.

Nakadi's main users are developers and analysts. It provides REST-based integration, support for multiple consumers, ordered delivery, an interactive UI, a fully managed service, security, data-quality guarantees, abstraction of the underlying big-data technology, and push-based consumption.

Nakadi is in active development and is in production inside Zalando as the backbone of our microservices, handling millions of events daily with a throughput of hundreds of gigabytes per second. In one line: Nakadi is a highly scalable data stream for enterprise engineering teams.

Nakadi Deployment Diagram

More detailed information can be found on our website.

Project goal

The goal of Nakadi (ნაკადი means stream in Georgian) is to provide an event broker infrastructure to:

  • Abstract event delivery via a secured RESTful API.

    This allows microservices teams to maintain service boundaries, and not directly depend on any specific message broker technology. Access can be managed individually for every queue and secured using OAuth and custom authorization plugins.

  • Enable convenient development of event-driven applications and asynchronous microservices.

    Event types can be defined with Event type schemas and managed via a registry. All events will be validated against the schema before publishing. This guarantees data quality and consistency for consumers.

  • Provide efficient, low-latency event delivery.

    Once a publisher sends an event via a simple HTTP POST, it can be pushed to consumers over a streaming HTTP connection, allowing near real-time event processing. The consumer connection has keepalive controls and supports managing stream offsets via subscriptions.

Development status

  • Nakadi is high-load production ready.
  • Zalando uses Nakadi as its central Event Bus Service.
  • Nakadi reliably handles traffic from thousands of event types with a throughput of hundreds of gigabytes per second.
  • The project is in active development.

Presentations

Features

  • Stream:
    • REST abstraction over Kafka-like queues.
    • CRUD for event types.
    • Event batch publishing.
    • Low-level interface (deprecated).
      • manual client-side partition management is needed
      • no support for commits
    • High-level interface (Subscription API).
      • automatic redistribution of partitions between consuming clients
      • commits should be issued to move server-side cursors
  • Schema:
    • Schema registry.
    • Several event type categories (Undefined, Business, Data Change).
    • Several partitioning strategies (Random, Hash, User defined).
    • Event enrichment strategies.
    • Schema evolution.
    • Events validation using an event type schema.
  • Security:
    • OAuth2 authentication.
    • Per-event type authorization.
    • Blacklist of users and applications.
  • Operations:
    • STUPS platform compatible.
    • ZMON monitoring compatible.
    • SLO monitoring.
    • Timelines:
      • this allows transparently switching production and consumption to a different cluster (tier, region, AZ) without moving the actual data and without any service degradation.
      • opens the possibility of implementing other streaming technologies and engines besides Kafka (such as Amazon Kinesis or Google Cloud Pub/Sub)

Read more about the latest development on the releases page.

Additional features that we plan to cover in the future are:

  • Support for different streaming technologies and engines. Nakadi currently uses Apache Kafka as its broker, but other providers (such as Kinesis) will be possible.
  • Filtering of events for subscribing consumers.
  • Store old published events forever using transparent fallback to backup storage such as AWS S3.
  • Separate the internal schema registry into a standalone service.
  • Support additional schema formats and protocols such as Avro, Protobuf, and others.

Related projects

The zalando-nakadi organisation contains many useful related projects.

How to contribute to Nakadi

Read our contribution guidelines on how to submit issues and pull requests, then get Nakadi up and running locally using Docker:

Dependencies

The Nakadi server is a Java 8 Spring Boot application. It uses Kafka 1.1.1 as its broker and PostgreSQL 9.5 as its supporting database.

Nakadi requires recent versions of docker and docker-compose. In particular, docker-compose >= v1.7.0 is required. See Install Docker Compose for information on installing the most recent docker-compose version.

The project is built with Gradle. The ./gradlew wrapper script will bootstrap the right Gradle version if it's not already installed.

Install

To get the source, clone the git repository.

git clone https://github.com/zalando/nakadi.git

Building

The Gradle setup is fairly standard; the main tasks are:

  • ./gradlew build: run a build and test
  • ./gradlew clean: clean down the build

Some other useful tasks are:

  • ./gradlew startNakadi: build Nakadi and start docker-compose services: nakadi, postgresql, zookeeper and kafka
  • ./gradlew stopNakadi: shutdown docker-compose services
  • ./gradlew startStorages: start docker-compose services: postgres, zookeeper and kafka (useful for development purposes)
  • ./gradlew stopStorages: shutdown docker-compose services
  • ./gradlew fullAcceptanceTest: start Nakadi configured for acceptance tests and run acceptance tests

For working with an IDE, the eclipse Gradle task is available, and you can import the build.gradle into IntelliJ IDEA directly.

Running a Server

From the project's home directory you can start Nakadi via Gradle:

./gradlew startNakadi

This will build the project and run docker-compose with four services:

  • Nakadi (8080)
  • PostgreSQL (5432)
  • Kafka (9092)
  • Zookeeper (2181)
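
Once the containers are up, a quick way to check that Nakadi is reachable (assuming the default local port 8080 shown above) is to list the registered event types, which should return an empty JSON array on a fresh instance:

curl -v http://localhost:8080/event-types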

To stop the running Nakadi server:

./gradlew stopNakadi

Using Nakadi and its API

Please read the manual for the full API usage details.

Creating Event Types

The Nakadi API allows the publishing and consuming of events over HTTP. To do this, the producer must register an event type with the Nakadi schema registry.

This example shows a minimalistic undefined category event type with a wildcard schema:

curl -v -XPOST http://localhost:8080/event-types -H "Content-type: application/json" -d '{
  "name": "order.ORDER_RECEIVED",
  "owning_application": "order-service",
  "category": "undefined",
  "schema": {
    "type": "json_schema",
    "schema": "{ \"additionalProperties\": true }"
  }
}'

Note: This is not a recommended category and schema. It should be used only for testing.

You can read more about this in the manual.
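
Beyond testing, you would normally create a business or data-change category event type with an explicit schema. The following is only a sketch: the hypothetical order.ORDER_CONFIRMED event type, its fields, the hash partitioning on order_number, and the metadata enrichment strategy are illustrative choices, not requirements of the API.

curl -v -XPOST http://localhost:8080/event-types -H "Content-type: application/json" -d '{
  "name": "order.ORDER_CONFIRMED",
  "owning_application": "order-service",
  "category": "business",
  "partition_strategy": "hash",
  "partition_key_fields": ["order_number"],
  "enrichment_strategies": ["metadata_enrichment"],
  "schema": {
    "type": "json_schema",
    "schema": "{ \"properties\": { \"order_number\": { \"type\": \"string\" } }, \"required\": [\"order_number\"] }"
  }
}'

With such a definition, Nakadi validates every published event against the schema and routes it to a partition based on the hash of order_number.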

Consuming Events

You can open a stream for an event type via the events sub-resource:

curl -v http://localhost:8080/event-types/order.ORDER_RECEIVED/events


HTTP/1.1 200 OK

{"cursor":{"partition":"0","offset":"82376-000087231"},"events":[{"order_number": "ORDER_001"}]}
{"cursor":{"partition":"0","offset":"82376-000087232"}}
{"cursor":{"partition":"0","offset":"82376-000087232"},"events":[{"order_number": "ORDER_002"}]}
{"cursor":{"partition":"0","offset":"82376-000087233"},"events":[{"order_number": "ORDER_003"}]}

You will see events as you publish them, for example from another console. Records without an events field are keep-alive messages.

Note: the low-level API should be used only for debugging; it is not recommended for production systems. For production use, please use the Subscription API.
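
A minimal sketch of the Subscription API flow (see the manual for the authoritative details; the owning_application value and the placeholder IDs below are illustrative): create a subscription for one or more event types, stream from it, and commit the cursors you receive. Nakadi distributes partitions between the clients of a subscription automatically and returns an X-Nakadi-StreamId header that must accompany cursor commits.

curl -v -XPOST http://localhost:8080/subscriptions -H "Content-type: application/json" -d '{
  "owning_application": "order-consumer",
  "event_types": ["order.ORDER_RECEIVED"],
  "read_from": "end"
}'

# stream events; note the X-Nakadi-StreamId response header
curl -v http://localhost:8080/subscriptions/{subscription_id}/events

# commit the cursors received in the stream to move the server-side offsets
curl -v -XPOST http://localhost:8080/subscriptions/{subscription_id}/cursors \
 -H "Content-type: application/json" \
 -H "X-Nakadi-StreamId: {stream_id}" \
 -d '{"items": [{"partition": "0", "offset": "001-0001-000000000000000001", "event_type": "order.ORDER_RECEIVED", "cursor_token": "{cursor_token}"}]}'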

Publishing Events

Events for an event type can be published by posting to its "events" collection:

curl -v -XPOST http://localhost:8080/event-types/order.ORDER_RECEIVED/events \
 -H "Content-type: application/json" \
 -d '[{
    "order_number": "24873243241"
  }, {
    "order_number": "24873243242"
  }]'


HTTP/1.1 200 OK  
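
For business or data-change category event types, each event must also include a metadata object with at least an eid and occurred_at supplied by the producer; the remaining metadata fields are enriched by Nakadi. A minimal sketch, assuming the hypothetical order.ORDER_CONFIRMED event type described above:

curl -v -XPOST http://localhost:8080/event-types/order.ORDER_CONFIRMED/events \
 -H "Content-type: application/json" \
 -d '[{
    "metadata": {
      "eid": "9b6b5f83-75c6-4f0e-bd96-3a2d6e2f4a11",
      "occurred_at": "2016-03-15T23:47:15+01:00"
    },
    "order_number": "24873243241"
  }]'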

Read more in the manual.

Contributing

Nakadi accepts contributions from the open-source community.

Please read CONTRIBUTING.md.

Please also note our CODE_OF_CONDUCT.md.

Contact

This email address serves as the main contact address for this project.

Bug reports and feature requests are more likely to be addressed if posted as issues here on GitHub.

License

Please read the full LICENSE

The MIT License (MIT) Copyright © 2015 Zalando SE, https://tech.zalando.com

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Comments
  • Audit log

    https://jira.zalando.net/browse/ARUHA-2123

    Description

    This PR contains code that captures entity content changes, including a snapshot of the state before and after the change. I decided to store the entities both as objects and as text (serialized JSON) to be sure that further transformations downstream (a data lake flatmap, for example) would not destroy relevant structural information. The object form is still present to make searching by entity attributes easier.

    opened by rcillo 26
  • Upgrade dependencies

    Upgrade most dependencies to latest version

    Zalando ticket : ARUHA-1683

    Description

    This PR upgrades (almost) all dependencies, according to the following plan:

    • upgrade Spring boot to latest stable patch version
    • upgrade all spring boot dependencies to:
      • the same version as used in spring boot, if the previous version was lower
      • unchanged, if the version used was already more recent than the spring boot dependency
    • upgrade other dependencies to the latest stable version whenever possible
    • upgrade test dependencies to the latest stable version that does not break tests

    Exception: Zalando Problem libraries will be upgraded in a separate PR, as they require significant refactoring.

    Review

    • [ ] Tests
    • [ ] Documentation
    • [ ] CHANGELOG
    opened by lmontrieux 24
  • Aruha 473 check size of events

    This PR checks the size of all events in a batch. If there is at least one event that is over the maximum size after enrichment, then the entire batch is rejected with a 422.

    opened by lmontrieux 24
  • ARUHA-1304: Nakadi collects the metric

    opened by adyach 21
  • [WIP] SplitByWeight Method Refactoring

    Could not resist refactoring the splitByWeight method. It was using a greedy strategy with an O(N^2)? algorithm. Refactored to a strictly O(N) algorithm. The idea is simple: just accumulate the rounding error and carry it over to the next element.

    in progress 
    opened by satybald 20
  • Aruha 1501 choose partition hila

    Implement the functionality to read from specific partitions of a subscription

    Zalando ticket : ARUHA-1501

    Description

    Implementation of functionality that allows users to select which partitions of a subscription to read from. API/documentation is here: https://github.com/zalando/nakadi/pull/834

    Review

    • [x] Tests
    • [x] Documentation
    • [x] CHANGELOG

    Deployment Notes

    We should closely monitor the consumption of users

    opened by v-stepanov 18
  • ARUHA-1275 upgrade json-schema version

    Also upgrade org.json version

    ARUHA-1275: Get json-schema optimisation merged and released

    Description

    Upgrade version of json-schema library, with performance improvements

    Review

    • [ ] Tests
    • [ ] Documentation
    • [x] CHANGELOG

    Deployment Notes

    These should highlight any db migrations, feature toggles, etc.

    opened by lmontrieux 18
  • Aruha 778 db timeout

    ARUHA-778 add timeout to database connections

    I'm opening this PR now, so we can test and argue over (and eventually agree on) the best value for each timeout.

    opened by lmontrieux 17
  • Aruha 456 block consumers

    @antban @rcillo @adyach @v1ctor @sergkam please review. (It was not yet tested properly; I will create a separate stack and test it there, so if you notice a bug in the code, let me know :)

    I have no idea how this small feature could grow to 700 lines of code :(

    Concept document moved to wiki: https://github.com/zalando/nakadi/wiki/Too-much-consumers-from-a-single-application

    opened by v-stepanov 17
  • Support OpenTracing span context

    https://techjira.zalando.net/browse/ARUHA-1883

    In order to trace distributed systems, an OpenTracing span context can be transported in the metadata of events. This way, publishers and consumers can share a span even when communicating asynchronously through Nakadi.

    opened by rcillo 16
  • aruha-190 define offsets format

    Up to now, Nakadi cursors were advertised to be opaque entities, and clients should not make assumptions about their structure and meaning.

    But we've been observing clients making assumptions and coding based on them. This could lead to major confusion in the future if we no longer hold those assumptions true.

    In order to mitigate this risk, we are committing to a well-defined structure beforehand. It's inspired by the Postgres WAL format https://www.postgresql.org/docs/9.2/static/wal-internals.html and we aim to address two issues with this format:

    1. Make Nakadi operations easier (this feature would have been tremendously useful for cluster migration or implementing partition rebalancing)
    2. Provide a well-defined set of guarantees for consumers to rely on.

    update

    As for why 27 characters = 9 (max integer 7FFFFFFF) + separator + 17 (max long 7FFFFFFFFFFFFFFF).

    opened by rcillo 16
  • Update to Gradle 7

    Separated into 5 commits:

    Preparation is done in the first 3 commits: minor cleanup plus performance improvements by delaying/avoiding task creation. The main upgrade work is done in the last 2 commits. Dependency changes were major but required, and the build still runs fine.

    opened by Jocce-Nilsson 0
  • Bump addressable from 2.6.0 to 2.8.1 in /docs

    Bumps addressable from 2.6.0 to 2.8.1.

    Changelog

    Sourced from addressable's changelog.

    Addressable 2.8.1

    • refactor Addressable::URI.normalize_path to address linter offenses (#430)
    • remove redundant colon in Addressable::URI::CharacterClasses::AUTHORITY regex (#438)
    • update gemspec to reflect supported Ruby versions (#466, #464, #463)
    • compatibility w/ public_suffix 5.x (#466, #465, #460)
    • fixes "invalid byte sequence in UTF-8" exception when unencoding URLs containing non UTF-8 characters (#459)
    • Ractor compatibility (#449)
    • use the whole string instead of a single line for template match (#431)
    • force UTF-8 encoding only if needed (#341)

    #460: sporkmonger/addressable#460 #463: sporkmonger/addressable#463 #464: sporkmonger/addressable#464 #465: sporkmonger/addressable#465 #466: sporkmonger/addressable#466

    Addressable 2.8.0

    • fixes ReDoS vulnerability in Addressable::Template#match
    • no longer replaces + with spaces in queries for non-http(s) schemes
    • fixed encoding ipv6 literals
    • the :compacted flag for normalized_query now dedupes parameters
    • fix broken escape_component alias
    • dropping support for Ruby 2.0 and 2.1
    • adding Ruby 3.0 compatibility for development tasks
    • drop support for rack-mount and remove Addressable::Template#generate
    • performance improvements
    • switch CI/CD to GitHub Actions

    Addressable 2.7.0

    • added :compacted flag to normalized_query
    • heuristic_parse handles mailto: more intuitively
    • dropped explicit support for JRuby 9.0.5.0
    • compatibility w/ public_suffix 4.x
    • performance improvements
    Commits
    • 8657465 Update version, gemspec, and CHANGELOG for 2.8.1 (#474)
    • 4fc5bb6 CI: remove Ubuntu 18.04 job (#473)
    • 860fede Force UTF-8 encoding only if needed (#341)
    • 99810af Merge pull request #431 from ojab/ct-_do_not_parse_multiline_strings
    • 7ce0f48 Merge branch 'main' into ct-_do_not_parse_multiline_strings
    • 7ecf751 Merge pull request #449 from okeeblow/freeze_concatenated_strings
    • 41f12dd Merge branch 'main' into freeze_concatenated_strings
    • 068f673 Merge pull request #459 from jarthod/iso-encoding-problem
    • b4c9882 Merge branch 'main' into iso-encoding-problem
    • 08d27e8 Merge pull request #471 from sporkmonger/sporkmonger-enable-codeql
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 1
  • Queue publishing of events for the same key in a batch

    When hash partitioning (and/or log compaction) is used, this approach ensures that the order of events is preserved for each key in the face of intermittent publishing errors.

    For now we simply mark any events that were not attempted for publishing as aborted. Later this can be improved to implement retry of failed events.

    No change of behavior when no event keys are set (random or user-defined partitioning): the whole batch is submitted as a single chunk.

    opened by a1exsh 0