51 lines
3.2 KiB
Plaintext
51 lines
3.2 KiB
Plaintext
[id="debezium-architecture"]
|
|
= {prodname} Architecture
|
|
|
|
|
|
Most commonly, {prodname} is deployed via Apache {link-kafka-docs}/#connect[Kafka Connect].
|
|
Kafka Connect is a framework and runtime for implementing and operating
|
|
|
|
* source connectors such as {prodname}, which ingest data into Kafka and
|
|
* sink connectors, which propagate data from Kafka topics into other systems.
|
|
|
|
The following image shows the architecture of a CDC pipeline based on {prodname}:
|
|
|
|
image::debezium-architecture.png[{prodname} Architecture]
|
|
|
|
Kafka Connect is operated as a separate service besides the Kafka broker itself.
|
|
The {prodname} connectors for MySQL and Postgres are deployed to capture the changes out of these two databases.
|
|
For this purpose, the two connectors establish a connection to the two source databases
|
|
using a client library for accessing the binlog in case of MySQL and reading from a logical replication stream in case of Postgres.
|
|
|
|
By default, the changes from one capture table are written to a corresponding Kafka topic.
|
|
If needed, the topic name can be adjusted with help of {prodname}'s {link-prefix}:{link-topic-routing}[topic routing SMT],
|
|
e.g. to use topic names that deviate from the captured tables names or to stream changes from multiple tables into a single topic.
|
|
|
|
Once the change events are in Apache Kafka, different connectors from the Kafka Connect eco-system can be used
|
|
to stream the changes to other systems and databases such as Elasticsearch, data warehouses and analytics systems or caches such as Infinispan.
|
|
Depending on the chosen sink connector, it may be needed to apply {prodname}'s {link-prefix}:{link-event-flattening}[new record state extraction] SMT,
|
|
which will only propagate the "after" structure from {prodname}'s event envelope to the sink connector.
|
|
|
|
ifdef::community[]
|
|
== {prodname} server
|
|
|
|
Another way to deploy {prodname} is using the xref:operations/debezium-server.adoc[standalone server].
|
|
The {prodname} server is a read-to-use application that streams change events from the source database to a variety of messaging infrastructures.
|
|
|
|
The following image shows the architecture of a CDC pipeline using the {prodname} server:
|
|
|
|
image:debezium-server-architecture.png[{prodname} Architecture]
|
|
|
|
The {prodname} server is configured to use one of the {prodname} source connectors to capture changes from the source database.
|
|
By default, the changes from one capture table in the source database are emitted as change records and consumed by one of a variety of sink messaging infrastructures like Amazon Kinesis, Google Cloud Pub/Sub, or Apache Pulsar.
|
|
|
|
== Embedded Engine
|
|
|
|
An alternative way for using the {prodname} connectors is the xref:operations/embedded.adoc[embedded engine].
|
|
In this case, {prodname} will not be run via Kafka Connect, but as a library embedded into your custom Java applications.
|
|
This can be useful for either consuming change events within your application itself,
|
|
without the needed for deploying complete Kafka and Kafka Connect clusters,
|
|
or for streaming changes to alternative messaging brokers such as Amazon Kinesis.
|
|
You can find https://github.com/debezium/debezium-examples/tree/master/kinesis[an example] for the latter in the examples repository.
|
|
endif::community[]
|