DBZ-1668 Documentation Updates [ci skip]

* Removed unnecessary roles and options for many adoc tables
* Fixed various typos
* Added consistent incubating note
* Minor tweaks to several code listings to include caption/titles
* Fixed several bugs where bullet lists were not rendered properly
* Added appropriate source code types to source blocks
Chris Cranford 2020-01-17 12:58:30 -05:00 committed by Gunnar Morling
parent d43fdbced5
commit 075bd57a03
15 changed files with 126 additions and 103 deletions

View File

@ -139,7 +139,7 @@ For `DELETE` events, this option is only supported when the `delete.handling.mod
[[configuration_options]]
== Configuration options
[cols="35%a,10%a,55%a",width=100,options="header,footer",role="table table-bordered table-striped"]
[cols="35%a,10%a,55%a",width=100,options="header"]
|=======================
|Property
|Default

View File

@ -253,7 +253,7 @@ For `DELETE` events, this option is only supported when the `delete.handling.mod
[[configuration_options]]
== Configuration options
[cols="35%a,10%a,55%a",width=100,options="header,footer",role="table table-bordered table-striped"]
[cols="35%a,10%a,55%a",width=100,options="header"]
|=======================
|Property
|Default

View File

@ -83,7 +83,7 @@ This result was achieved with the link:#configuration-options[default configurat
[[configuration-options]]
=== Configuration options
[cols="30%a,10%a,10%a,50%a",width=100,options="header,footer",role="table table-bordered table-striped"]
[cols="30%a,10%a,10%a,50%a",width=100,options="header"]
|=======================
|Property
|Default
@ -175,7 +175,7 @@ payload | jsonb |
After observing all those pieces we can see what the default configuration does:
[cols="30%a,70%a",width=100,options="header,footer",role="table table-bordered table-striped"]
[cols="30%a,70%a",width=100,options="header"]
|=======================
|Table Column
|Effect

View File

@ -8,6 +8,11 @@ include::../_attributes.adoc[]
toc::[]
[NOTE]
====
This connector is currently in incubating state, i.e. exact semantics, configuration options etc. may change in future revisions, based on the feedback we receive. Please let us know if you encounter any problems.
====
The Cassandra connector can monitor a Cassandra cluster and record all row-level changes. The connector must be deployed locally on each node in the Cassandra cluster. The first time the connector connects to a Cassandra node, it performs a snapshot of all CDC-enabled tables in all keyspaces. The connector also reads the changes that are written to Cassandra commit logs and generates corresponding insert, update, and delete events. All events for each table are recorded in a separate Kafka topic, where they can be easily consumed by applications and services.
@ -30,6 +35,7 @@ Cassandra is different from the other Debezium connectors since it is not implem
[WARNING]
====
The following features are currently not supported by the Cassandra connector. Changes resulting from any of these features are ignored:
* TTLs
* Range deletes
* Static columns
@ -49,21 +55,20 @@ Before the Debezium Cassandra connector can be used to monitor the changes in a
To enable CDC, update the following CDC config in `cassandra.yaml`:
[source]
[source,yaml]
----
cdc_enabled: true
----
Additional CDC configs have the following default values:
[source]
[source,yaml]
----
cdc_raw_directory: $CASSANDRA_HOME/data/cdc_raw
cdc_free_space_in_mb: 4096
cdc_free_space_check_interval_ms: 250
cdc_raw_directory: $CASSANDRA_HOME/data/cdc_raw
cdc_free_space_in_mb: 4096
cdc_free_space_check_interval_ms: 250
----
where:
* `cdc_enabled` enables or disables CDC operations node-wide
* `cdc_raw_directory` determines the destination for commit log segments to be moved after all corresponding memtables are flushed
* `cdc_free_space_in_mb` is the maximum capacity allocated to store commit log segments, and defaults to the minimum of 4096 MB and 1/8 of volume space.
@ -75,7 +80,7 @@ where:
Once CDC is enabled on the Cassandra node, each table must be explicitly enabled for CDC as well via the `CREATE TABLE` or `ALTER TABLE` command. For example:
[source]
[source,sql]
----
CREATE TABLE foo (a int, b text, PRIMARY KEY(a)) WITH cdc=true;
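-- CDC can likewise be toggled for an existing table via ALTER TABLE; an illustrative sketch:
ALTER TABLE foo WITH cdc=true;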
@ -129,7 +134,7 @@ Cassandra's commit logs come with a set of limitations, which are critical for i
* Cassandra does not perform read-before-write; as a result, commit logs do not record the value of every column in the changed row, only the values of columns that have been modified (except for partition key columns, which are always recorded because they are required in Cassandra DML commands).
* Due to the nature of CQL, _insert_ DMLs can result in a row insertion or update; _update_ DMLs can result in a row insertion, update, or deletion; _delete_ DMLs can result in a row update or deletion. Since queries are not recorded in commit logs, the CDC event type is classified based on the effect on the row in a relational database sense.
#TODO: is there a way to determine event type which corresponds to the actual Cassandra DML statement? and if so, is that preferred over the semantic of these events?
**TODO**: is there a way to determine event type which corresponds to the actual Cassandra DML statement? and if so, is that preferred over the semantic of these events?
[NOTE]
====
@ -155,18 +160,18 @@ For example, consider a Cassandra installation with an `inventory` keyspace that
* `fulfillment.inventory.customers`
* `fulfillment.inventory.orders`
#TODO: for topic name, is _clusterName_._keyspaceName_._tableName_ okay? or should it be _connectorName_._keyspaceName_._tableName_ or _connectorName_._clusterName_._keyspaceName_._tableName_?
**TODO**: for topic name, is _clusterName_._keyspaceName_._tableName_ okay? or should it be _connectorName_._keyspaceName_._tableName_ or _connectorName_._clusterName_._keyspaceName_._tableName_?
[[schema-evolution]]
=== Schema Evolution
DDLs are not recorded in commit logs. When the schema of a table changes, this change is issued from one of the Cassandra nodes and propagated to other nodes via the Gossip protocol. This implies that detection of schema changes is achieved on a best-effort basis. This is done by periodically polling the schema of each CDC-enabled table in the cluster via a Cassandra driver, and then updating the cached version of the schema. Because of this implementation, if a new column is added to a table and then writes are issued against that column immediately, it is possible that data from that column will not be reflected in the CDC event. This is why it is recommended to pause for some time (configured with `schema.refresh.interval.ms`) after issuing a schema change.
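For illustration, a longer refresh interval could be set in the connector configuration like this (the value below is only an example, not a documented default):

[source,properties]
----
# refresh the cached table schemas every 30 seconds (illustrative value)
schema.refresh.interval.ms=30000
----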
#TODO: it may be possible to reactively refresh schema whenever an unexpect column appears in a mutation to improve schema change detection; worth looking into.
**TODO**: it may be possible to reactively refresh schema whenever an unexpect column appears in a mutation to improve schema change detection; worth looking into.
When sending a message to a topic `t`, the Avro schemas for the key and the value will be automatically registered in the Confluent Schema Registry under the subjects `t-key` and `t-value`, respectively, if the compatibility test passes. Although it is possible to replay a history of all table schemas via the Schema Registry, only the latest schema of each table is used to generate CDC events.
#TODO: look into if it's possible to leverage schema history to rebuild schema that exist at the specific position in the commit log, rather than the current schema, when restarting the connector. I don't think it's possible right now, because writes to Cassandra node are not received in order.
**TODO**: look into if it's possible to leverage schema history to rebuild schema that exist at the specific position in the commit log, rather than the current schema, when restarting the connector. I don't think it's possible right now, because writes to Cassandra node are not received in order.
[[events]]
=== Events
@ -178,7 +183,7 @@ All data change events produced by the Cassandra connector have a key and a valu
For a given table, the change event's key will have a structure that contains a field for each column in the primary key of the table at the time the event was created. Consider an `inventory` database with a `customers` table defined as:
[source,indent=0]
[source,sql,indent=0]
----
CREATE TABLE customers (
id bigint,
@ -440,11 +445,11 @@ The following is a JSON representation of a value schema for a _create_ event fo
}
----
#TODO: verify max timestamp != deletion timestamp in case of deletion DDLs
**TODO**: verify max timestamp != deletion timestamp in case of deletion DDLs
Given the following `insert` DML:
[source,indent=0]
[source,sql,indent=0]
----
INSERT INTO customers (
id,
@ -511,7 +516,7 @@ The value payload in JSON representation would look like this:
Given the following `update` DML:
[source,indent=0]
[source,sql,indent=0]
----
UPDATE customers
SET email = "annek_new@noanswer.org"
@ -559,6 +564,7 @@ The value payload in JSON representation would look like this:
----
When we compare this to the value in the _insert_ event, we see a couple of differences:
* The `op` field value is now `u`, signifying that this row changed because of an update.
* The `after` field now has the updated state of the row, and here we can see that the email value is now `annek_new@noanswer.org`. Notice that `first_name` and `last_name` are null; this is because these fields did not change during this update. However, `id` and `registration_date` are still included, because these are the primary keys of this table.
* The `source` field structure has the same fields as before, but the values are different since this event is from a different position in the commit log.
@ -566,7 +572,7 @@ When we compare this to the value in the _insert_ event, we see a couple differe
Finally, given the following `delete` DML:
[source]
[source,sql]
----
DELETE FROM customers
WHERE id = 1001 AND registration_date = 1562202942545;
@ -609,14 +615,15 @@ The value payload in JSON representation would look like this:
----
When we compare this to the value in the _insert_ and _update_ events, we see a couple of differences:
* The `op` field value is now `d`, signifying that this row changed because of a deletion.
* The `after` field only contains values for `id` and `registration_date` because this is a deletion by primary keys.
* The `source` field structure has the same fields as before, but the values are different since this event is from a different position in the commit log.
* The `ts_ms` field shows the timestamp, in milliseconds, at which the connector processed this event.
#TODO: given TTL is not currently support, would it be better to remove delete_ts? would it also be okay to derive whether a field is set or not by looking at the each column to see if it's null?
**TODO**: given TTL is not currently support, would it be better to remove delete_ts? would it also be okay to derive whether a field is set or not by looking at the each column to see if it's null?
#TODO: discuss tombstone events in Cassandra connector
**TODO**: discuss tombstone events in Cassandra connector
[[data-types]]
=== Data Types
@ -625,7 +632,7 @@ As described above, the Cassandra connector represents the changes to rows with
The following table describes how the connector maps each of the Cassandra data types to an Avro type.
[cols="30%a, 30%a, 40%a",width=100,options="header,footer",role="table table-bordered table-striped"]
[cols="30%a, 30%a, 40%a",width=100,options="header"]
|=======================
|Cassandra Data Type
|Literal Type (Schema Type)
@ -737,7 +744,7 @@ The following table describes how the connector maps each of the Cassandra data
|=======================
#TODO: add logical types
**TODO**: add logical types
[[when-things-go-wrong]]
=== When Things Go Wrong
@ -776,7 +783,7 @@ The Cassandra connector should be deployed each Cassandra node in a Cassandra cl
The following represents an example .properties configuration file for running and testing the Cassandra Connector locally:
[source,indent=0]
[source,properties,indent=0]
----
connector.name=test_connector
commit.log.relocation.dir=/Users/test_user/debezium-connector-cassandra/test_dir/relocation/
@ -812,7 +819,7 @@ Cassandra connector has built-in support for JMX metrics. The Cassandra driver a
[[snapshot-metrics]]
==== Snapshot Metrics
[cols="30%a,10%a,60%a",width=100,options="header,footer",role="table table-bordered table-striped"]
[cols="30%a,10%a,60%a",width=100,options="header"]
|=======================
|Attribute Name
|Type
@ -850,7 +857,7 @@ Cassandra connector has built-in support for JMX metrics. The Cassandra driver a
[[commitlog-metrics]]
==== Commitlog Metrics
[cols="30%a,10%a,60%a",width=100,options="header,footer",role="table table-bordered table-striped"]
[cols="30%a,10%a,60%a",width=100,options="header"]
|=======================
|Attribute Name
|Type
@ -878,7 +885,7 @@ Cassandra connector has built-in support for JMX metrics. The Cassandra driver a
[[connector-properties]]
=== Connector properties
[cols="35%a,10%a,55%a",options="header,footer",role="table table-bordered table-striped code-wordbreak-col"]
[cols="35%a,10%a,55%a",options="header"]
|=======================
|Property
|Default
@ -1001,7 +1008,7 @@ The connector also supports pass-through configuration properties that are used
For example, the following connector configuration properties can be used to http://kafka.apache.org/documentation.html#security_configclients[secure connections to the Kafka broker]:
[source,indent=0]
[source,properties,indent=0]
----
kafka.producer.security.protocol=SSL
kafka.producer.ssl.keystore.location=/var/private/ssl/kafka.server.keystore.jks

View File

@ -601,7 +601,7 @@ This configuration can be sent via POST to a running Kafka Connect service, whic
The following configuration properties are _required_ unless a default value is available.
[cols="35%a,10%a,55%a",options="header,footer",role="table table-bordered table-striped code-wordbreak-col"]
[cols="35%a,10%a,55%a",options="header"]
|=======================
|Property
|Default
@ -702,7 +702,7 @@ Defaults to 0, which indicates that the server chooses an appropriate fetch size
The following _advanced_ configuration properties have good defaults that will work in most situations and therefore rarely need to be specified in the connector's configuration.
[cols="35%a,10%a,55%a",width=100,options="header,footer",role="table table-bordered table-striped code-wordbreak-col"]
[cols="35%a,10%a,55%a",width=100,options="header"]
|=======================
|Property
|Default

View File

@ -7,11 +7,14 @@ include::../_attributes.adoc[]
toc::[]
[NOTE]
====
This connector is currently in incubating state, i.e. exact semantics, configuration options etc. may change in future revisions, based on the feedback we receive. Please let us know if you encounter any problems.
====
Debezium's Oracle Connector can monitor and record all of the row-level changes in the databases on an Oracle server.
This connector is at an early stage of development and considered an incubating feature as of Debezium 0.8.
It is not feature-complete and the structure of emitted CDC messages may change in future revisions.
Most notably, the connector does not yet support changes to the structure of captured tables (e.g. `ALTER TABLE...`) after the initial snapshot has been completed
(see {jira-url}/browse/DBZ-718[DBZ-718], scheduled for one of the upcoming 0.9.x releases).
(see {jira-url}/browse/DBZ-718[DBZ-718]).
It is supported, though, to capture tables newly added while the connector is running
(provided the new table's name matches the connector's filter configuration).
@ -21,7 +24,7 @@ It is supported though to capture tables newly added while the connector is runn
As of Debezium 0.8, change events from Oracle are ingested using the https://docs.oracle.com/database/121/XSTRM/xstrm_intro.htm#XSTRM72647[XStream API].
In order to use this API and hence this connector, you need to have a license for the GoldenGate product
(though it's not required that GoldenGate itself is installed).
We are going to explore alternatives to using XStream in future Debezium 0.9.x releases, e.g. based on LogMiner and/or alternative solutions.
We are currently exploring alternatives to using XStream for a future Debezium release, e.g. based on LogMiner and/or alternative solutions.
Please track the {jira-url}/browse/DBZ-137[DBZ-137] JIRA issue and join the discussion if you are aware of potential other ways for ingesting change events from Oracle.
[[setting-up-oracle]]
@ -624,7 +627,7 @@ Please file a {jira-url}/browse/DBZ[JIRA issue] for any specific types you are m
[[character-values]]
==== Character Values
[cols="20%a,15%a,30%a,35%a",width=150,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="20%a,15%a,30%a,35%a",width=150,options="header"]
|=======================
|Oracle Data Type
|Literal type (schema type)
@ -661,7 +664,7 @@ Please file a {jira-url}/browse/DBZ[JIRA issue] for any specific types you are m
[[numeric-values]]
==== Numeric Values
[cols="20%a,15%a,30%a,35%a",width=150,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="20%a,15%a,30%a,35%a",width=150,options="header"]
|=======================
|Oracle Data Type
|Literal type (schema type)
@ -743,7 +746,7 @@ The last option for `decimal.handling.mode` configuration property is `string`.
[[temporal-values]]
==== Temporal Values
[cols="20%a,15%a,30%a,35%a",width=150,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="20%a,15%a,30%a,35%a",width=150,options="header"]
|=======================
|Oracle Data Type
|Literal type (schema type)
@ -792,7 +795,7 @@ Extract the archive into a directory, e.g. _/path/to/instant_client/.
Copy the files _ojdbc8.jar_ and _xstreams.jar_ from the Instant Client into Kafka's _libs_ directory.
Create the environment variable `LD_LIBRARY_PATH`, pointing to the Instant Client directory:
[source,indent=0]
[source,bash,indent=0]
----
LD_LIBRARY_PATH=/path/to/instant_client/
----
@ -802,7 +805,7 @@ LD_LIBRARY_PATH=/path/to/instant_client/
The following shows an example JSON request for registering an instance of the Debezium Oracle connector:
[source,indent=0]
[source,json,indent=0]
----
{
"name": "inventory-connector",
@ -835,7 +838,7 @@ Kafka, Zookeeper, and Kafka Connect all have link:/docs/monitoring/[built-in sup
===== *MBean: debezium.oracle:type=connector-metrics,context=snapshot,server=_<database.server.name>_*
[cols="30%a,10%a,60%a",width=100,options="header,footer",role="table table-bordered table-striped"]
[cols="30%a,10%a,60%a",width=100,options="header"]
|=======================
|Attribute Name
|Type
@ -906,7 +909,7 @@ Kafka, Zookeeper, and Kafka Connect all have link:/docs/monitoring/[built-in sup
===== *MBean: debezium.oracle:type=connector-metrics,context=streaming,server=_<database.server.name>_*
[cols="30%a,10%a,60%a",width=100,options="header,footer",role="table table-bordered table-striped"]
[cols="30%a,10%a,60%a",width=100,options="header"]
|=======================
|Attribute Name
|Type
@ -968,7 +971,7 @@ Kafka, Zookeeper, and Kafka Connect all have link:/docs/monitoring/[built-in sup
===== *MBean: debezium.mysql:type=connector-metrics,context=schema-history,server=_<database.server.name>_*
[cols="30%a,10%a,60%a",width=100,options="header,footer",role="table table-bordered table-striped"]
[cols="30%a,10%a,60%a",width=100,options="header"]
|=======================
|Attribute Name
|Type
@ -1015,7 +1018,7 @@ Kafka, Zookeeper, and Kafka Connect all have link:/docs/monitoring/[built-in sup
The following configuration properties are _required_ unless a default value is available.
[cols="35%a,10%a,55%a",options="header,footer",role="table table-bordered table-striped code-wordbreak-col"]
[cols="35%a,10%a,55%a",options="header"]
|=======================
|Property
|Default

View File

@ -209,8 +209,8 @@ All up-to-date differences are tracked in a test suite https://github.com/debezi
If you are using one of the supported link:#output-plugin[logical decoding plug-ins] (i.e. not pgoutput) and it has been installed,
configure the server to load the plugin at startup:
*postgresql.conf*
[source]
.postgresql.conf
[source,properties]
----
# MODULES
shared_preload_libraries = 'decoderbufs,wal2json' //<1>
@ -219,8 +219,8 @@ shared_preload_libraries = 'decoderbufs,wal2json' //<1>
Next, configure the replication slot regardless of the decoder being used:
*postgresql.conf*
[source]
.postgresql.conf
[source,properties]
----
# REPLICATION
wal_level = logical //<1>
@ -255,7 +255,7 @@ Replication can only be performed by a database user that has appropriate permis
In order to give a user replication permissions, define a PostgreSQL role that has _at least_ the `REPLICATION` and `LOGIN` permissions. For example:
[source]
[source,sql]
----
CREATE ROLE name REPLICATION LOGIN;
----
@ -944,7 +944,7 @@ Here, the _literal type_ describes how the value is literally represented using
The _semantic type_ describes how the Kafka Connect schema captures the _meaning_ of the field using the name of the Kafka Connect schema for the field.
[cols="20%a,15%a,30%a,35%a",width=150,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="20%a,15%a,30%a,35%a",width=150,options="header"]
|=======================
|PostgreSQL Data Type
|Literal type (schema type)
@ -1117,7 +1117,7 @@ Other data type mappings are described in the following sections.
Other than PostgreSQL's `TIMESTAMPTZ` and `TIMETZ` data types (which contain time zone information), the other temporal types depend on the value of the `time.precision.mode` configuration property. When the `time.precision.mode` configuration property is set to `adaptive` (the default), then the connector will determine the literal type and semantic type for the temporal types based on the column's data type definition so that events _exactly_ represent the values in the database:
[cols="20%a,15%a,30%a,35%a",width=150,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="20%a,15%a,30%a,35%a",width=150,options="header"]
|=======================
|PostgreSQL Data Type
|Literal type (schema type)
@ -1153,7 +1153,7 @@ Other than PostgreSQL's `TIMESTAMPTZ` and `TIMETZ` data types (which contain tim
When the `time.precision.mode` configuration property is set to `adaptive_time_microseconds`, then the connector will determine the literal type and semantic type for the temporal types based on the column's data type definition so that events _exactly_ represent the values in the database, except that all TIME fields will be captured as microseconds:
[cols="20%a,15%a,30%a,35%a",width=150,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="20%a,15%a,30%a,35%a",width=150,options="header"]
|=======================
|PostgreSQL Data Type
|Literal type (schema type)
@ -1184,7 +1184,7 @@ When the `time.precision.mode` configuration property is set to `adaptive_time_m
When the `time.precision.mode` configuration property is set to `connect`, then the connector will use the predefined Kafka Connect logical types. This may be useful when consumers only know about the built-in Kafka Connect logical types and are unable to handle variable-precision time values. On the other hand, since PostgreSQL supports microsecond precision, the events generated by a connector with the `connect` time precision mode will *result in a loss of precision* when the database column has a _fractional second precision_ value greater than 3:
[cols="20%a,15%a,30%a,35%a",width=150,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="20%a,15%a,30%a,35%a",width=150,options="header"]
|=======================
|PostgreSQL Data Type
|Literal type (schema type)
@ -1223,7 +1223,7 @@ Note that the timezone of the JVM running Kafka Connect and {prodname} does not
When the `decimal.handling.mode` configuration property is set to `precise`, then the connector will use the predefined Kafka Connect `org.apache.kafka.connect.data.Decimal` logical type for all `DECIMAL` and `NUMERIC` columns. This is the default mode.
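As a sketch, this mode could also be set explicitly in the connector configuration; the property value comes from the paragraph above, and the surrounding connector properties are assumed:

[source,properties]
----
# keep the default precise handling of DECIMAL and NUMERIC columns
decimal.handling.mode=precise
----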
[cols="15%a,15%a,35%a,35%a",width=100,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="15%a,15%a,35%a,35%a",width=100,options="header"]
|=======================
|PostgreSQL Data Type
|Literal type (schema type)
@ -1246,7 +1246,7 @@ There is an exception to this rule.
When the `NUMERIC` or `DECIMAL` types are used without any scale constraints, the values coming from the database have a different (variable) scale for each value.
In this case the type `io.debezium.data.VariableScaleDecimal` is used, and it contains both the value and the scale of the transferred value.
[cols="15%a,15%a,35%a,35%a",width=100,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="15%a,15%a,35%a,35%a",width=100,options="header"]
|=======================
|PostgreSQL Data Type
|Literal type (schema type)
@ -1267,7 +1267,7 @@ In this case a type `io.debezium.data.VariableScaleDecimal` is used and it conta
However, when the `decimal.handling.mode` configuration property is set to `double`, then the connector will represent all `DECIMAL` and `NUMERIC` values as Java double values and encode them as follows:
[cols="15%a,15%a,35%a,35%a",width=100,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="15%a,15%a,35%a,35%a",width=100,options="header"]
|=======================
|PostgreSQL Data Type
|Literal type (schema type)
@ -1288,7 +1288,7 @@ However, when `decimal.handling.mode` configuration property is set to `double`,
The last option for the `decimal.handling.mode` configuration property is `string`. In this case the connector will represent all `DECIMAL` and `NUMERIC` values as their formatted string representation and encode them as follows:
[cols="15%a,15%a,35%a,35%a",width=100,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="15%a,15%a,35%a,35%a",width=100,options="header"]
|=======================
|PostgreSQL Data Type
|Literal type (schema type)
@ -1314,7 +1314,7 @@ PostgreSQL supports `NaN` (not a number) special value to be stored in the `DECI
When the `hstore.handling.mode` configuration property is set to `map`, then the connector will use the `java.util.Map<String,String>` logical type (`MAP` schema type) for all `HSTORE` columns. This is the default mode.
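A corresponding connector configuration entry might look like the following sketch (illustrative only):

[source,properties]
----
# map is the default; json switches to the JSON string representation described below
hstore.handling.mode=map
----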
[cols="15%a,15%a,35%a,35%a",width=100,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="15%a,15%a,35%a,35%a",width=100,options="header"]
|=======================
|PostgreSQL Data Type
|Literal type (schema type)
@ -1330,7 +1330,7 @@ When `hstore.handling.mode` configuration property is set to `map`, then the con
However, when the `hstore.handling.mode` configuration property is set to `json`, then the connector will represent all `HSTORE` values as JSON string values and encode them as follows:
[cols="15%a,15%a,35%a,35%a",width=100,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="15%a,15%a,35%a,35%a",width=100,options="header"]
|=======================
|PostgreSQL Data Type
|Literal type (schema type)
@ -1366,7 +1366,7 @@ When a column is defined using a domain type that extends another domain type th
PostgreSQL also has data types that can store IPv4, IPv6, and MAC addresses. It is better to use these instead of plain text types to store network addresses, because these types offer input error checking and specialized operators and functions.
[cols="15%a,15%a,35%a,35%a",width=100,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="15%a,15%a,35%a,35%a",width=100,options="header"]
|=======================
|PostgreSQL Data Type
|Literal type (schema type)
@ -1398,7 +1398,7 @@ PostgreSQL also have data types that can store IPv4, IPv6, and MAC addresses. It
The PostgreSQL connector also has full support for all of the http://postgis.net[PostGIS data types].
[cols="20%a,15%a,30%a,35%a",width=150,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="20%a,15%a,30%a,35%a",width=150,options="header"]
|=======================
|PostGIS Data Type
|Literal type (schema type)
@ -1546,7 +1546,7 @@ This configuration can be sent via POST to a running Kafka Connect service, whic
The following configuration properties are _required_ unless a default value is available.
[cols="35%a,10%a,55%a",options="header,footer",role="table table-bordered table-striped code-wordbreak-col"]
[cols="35%a,10%a,55%a",options="header"]
|=======================
|Property
|Default
@ -1708,7 +1708,7 @@ Fully-qualified tables could be defined as `DB_NAME.TABLE_NAME` or `SCHEMA_NAME.
The following _advanced_ configuration properties have good defaults that will work in most situations and therefore rarely need to be specified in the connector's configuration.
[cols="35%a,10%a,55%a",width=100,options="header,footer",role="table table-bordered table-striped code-wordbreak-col"]
[cols="35%a,10%a,55%a",width=100,options="header"]
|=======================
|Property
|Default

View File

@ -680,12 +680,13 @@ GO
----
The Kafka Connect log will contain messages like these:
```
[source,shell]
----
connect_1 | 2019-01-17 10:11:14,924 INFO || Multiple capture instances present for the same table: Capture instance "dbo_customers" [sourceTableId=testDB.dbo.customers, changeTableId=testDB.cdc.dbo_customers_CT, startLsn=00000024:00000d98:0036, changeTableObjectId=1525580473, stopLsn=00000025:00000ef8:0048] and Capture instance "dbo_customers_v2" [sourceTableId=testDB.dbo.customers, changeTableId=testDB.cdc.dbo_customers_v2_CT, startLsn=00000025:00000ef8:0048, changeTableObjectId=1749581271, stopLsn=NULL] [io.debezium.connector.sqlserver.SqlServerStreamingChangeEventSource]
connect_1 | 2019-01-17 10:11:14,924 INFO || Schema will be changed for ChangeTable [captureInstance=dbo_customers_v2, sourceTableId=testDB.dbo.customers, changeTableId=testDB.cdc.dbo_customers_v2_CT, startLsn=00000025:00000ef8:0048, changeTableObjectId=1749581271, stopLsn=NULL] [io.debezium.connector.sqlserver.SqlServerStreamingChangeEventSource]
...
connect_1 | 2019-01-17 10:11:33,719 INFO || Migrating schema to ChangeTable [captureInstance=dbo_customers_v2, sourceTableId=testDB.dbo.customers, changeTableId=testDB.cdc.dbo_customers_v2_CT, startLsn=00000025:00000ef8:0048, changeTableObjectId=1749581271, stopLsn=NULL] [io.debezium.connector.sqlserver.SqlServerStreamingChangeEventSource]
```
----
Eventually, a new field appears in the schema and value of the messages written to the Kafka topic.
[source,json]
@ -723,7 +724,7 @@ The following table describes how the connector maps each of the SQL Server data
Here, the _literal type_ describes how the value is literally represented using Kafka Connect schema types, namely `INT8`, `INT16`, `INT32`, `INT64`, `FLOAT32`, `FLOAT64`, `BOOLEAN`, `STRING`, `BYTES`, `ARRAY`, `MAP`, and `STRUCT`.
The _semantic type_ describes how the Kafka Connect schema captures the _meaning_ of the field using the name of the Kafka Connect schema for the field.
[cols="20%a,15%a,30%a,35%a",width=100,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="20%a,15%a,30%a,35%a",width=100,options="header"]
|=======================
|SQL Server Data Type
|Literal type (schema type)
@ -821,7 +822,7 @@ endif::cdc-product[]
Other than SQL Server's `DATETIMEOFFSET` data type (which contains time zone information), the other temporal types depend on the value of the `time.precision.mode` configuration property. When the `time.precision.mode` configuration property is set to `adaptive` (the default), then the connector will determine the literal type and semantic type for the temporal types based on the column's data type definition so that events _exactly_ represent the values in the database:
[cols="20%a,15%a,30%a,35%a",width=150,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="20%a,15%a,30%a,35%a",width=150,options="header"]
|=======================
|SQL Server Data Type
|Literal type (schema type)
@ -877,7 +878,7 @@ Other than SQL Server's `DATETIMEOFFSET` data type (which contain time zone info
When the `time.precision.mode` configuration property is set to `connect`, then the connector will use the predefined Kafka Connect logical types. This may be useful when consumers only know about the built-in Kafka Connect logical types and are unable to handle variable-precision time values. On the other hand, since SQL Server supports a precision of one tenth of a microsecond, the events generated by a connector with the `connect` time precision mode will *result in a loss of precision* when the database column has a _fractional second precision_ value greater than 3:
[cols="20%a,15%a,30%a,35%a",width=150,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="20%a,15%a,30%a,35%a",width=150,options="header"]
|=======================
|SQL Server Data Type
|Literal type (schema type)
@ -922,7 +923,7 @@ Note that the timezone of the JVM running Kafka Connect and {prodname} does not
==== Decimal values
[cols="15%a,15%a,35%a,35%a",width=100,options="header,footer",role="table table-bordered table-striped code-wordbreak-col3 code-wordbreak-col4"]
[cols="15%a,15%a,35%a,35%a",width=100,options="header"]
|=======================
|SQL Server Data Type
|Literal type (schema type)
@ -1062,7 +1063,7 @@ Kafka, Zookeeper, and Kafka Connect all have built-in support for JMX metrics. T
===== *MBean: debezium.sql_server:type=connector-metrics,context=snapshot,server=_<database.server.name>_*
[cols="30%a,10%a,60%a",width=100,options="header,footer",role="table table-bordered table-striped"]
[cols="30%a,10%a,60%a",width=100,options="header"]
|=======================
|Attribute Name
|Type
@ -1133,7 +1134,7 @@ Kafka, Zookeeper, and Kafka Connect all have built-in support for JMX metrics. T
===== *MBean: debezium.sql_server:type=connector-metrics,context=streaming,server=_<database.server.name>_*
[cols="30%a,10%a,60%a",width=100,options="header,footer",role="table table-bordered table-striped"]
[cols="30%a,10%a,60%a",width=100,options="header"]
|=======================
|Attribute Name
|Type
@ -1195,7 +1196,7 @@ Kafka, Zookeeper, and Kafka Connect all have built-in support for JMX metrics. T
===== *MBean: debezium.sql_server:type=connector-metrics,context=schema-history,server=_<database.server.name>_*
[cols="30%a,10%a,60%a",width=100,options="header,footer",role="table table-bordered table-striped"]
[cols="30%a,10%a,60%a",width=100,options="header"]
|=======================
|Attribute Name
|Type
@ -1242,7 +1243,7 @@ Kafka, Zookeeper, and Kafka Connect all have built-in support for JMX metrics. T
The following configuration properties are _required_ unless a default value is available.
[cols="35%a,10%a,55%a",options="header,footer",role="table table-bordered table-striped code-wordbreak-col"]
[cols="35%a,10%a,55%a",options="header"]
|=======================
|Property
|Default
@ -1335,7 +1336,7 @@ Fully-qualified tables could be defined as `DB_NAME.TABLE_NAME` or `SCHEMA_NAME.
The following _advanced_ configuration properties have good defaults that will work in most situations and therefore rarely need to be specified in the connector's configuration.
[cols="35%a,10%a,55%a",width=100,options="header,footer",role="table table-bordered table-striped code-wordbreak-col"]
[cols="35%a,10%a,55%a",width=100,options="header"]
|=======================
|Property
|Default
@ -1443,7 +1444,7 @@ For example, the following connector configuration properties can be used to htt
In addition to the _pass-through_ to the Kafka producer and consumer, the properties starting with `database.`, e.g. `database.applicationName=debezium` are passed to the JDBC URL.
[source,indent=0]
[source,properties,indent=0]
----
database.history.producer.security.protocol=SSL
database.history.producer.ssl.keystore.location=/var/private/ssl/kafka.server.keystore.jks
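# illustrative pass-through of a database.* property to the JDBC URL, as described above
database.applicationName=debezium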

View File

@ -11,7 +11,7 @@ toc::[]
[NOTE]
====
Support for CloudEvents currently is in incubating state, i.e. exact semantics, configuration options etc. may change in future revisions, based on the feedback we receive.
Please let us know or your specific requirements or if you encounter any problems will using this feature.
Please let us know of your specific requirements or if you encounter any problems while using this feature.
====
https://cloudevents.io/[CloudEvents] is a "specification for describing event data in a common way",
@ -111,7 +111,7 @@ Finally, it's also possible to use Avro for the entire envelope as well as the `
The following configuration options exist when using the CloudEvents converter:
[cols="35%a,10%a,55%a",width=100,options="header,footer",role="table table-bordered table-striped"]
[cols="35%a,10%a,55%a",width=100,options="header"]
|=======================
|Property
|Default

View File

@ -10,7 +10,7 @@ toc::[]
[NOTE]
====
This feature is currently in incubating state, i.e. exact semantics, configuration options etc. may change in future revisions, based on the feedback we receive. Please let us know if you encounter any problems will using this extension.
This feature is currently in incubating state, i.e. exact semantics, configuration options etc. may change in future revisions, based on the feedback we receive. Please let us know if you encounter any problems while using this extension.
====
== Overview
@ -48,6 +48,8 @@ It's important that for a given Quarkus application, *all* implementations of th
== Example
The following illustrates an implementation of the `ExportedEvent` interface representing an order that has been created:
.OrderCreatedEvent.java
[source,java,indent=0]
----
public class OrderCreatedEvent implements ExportedEvent<String, JsonNode> {
@ -93,6 +95,8 @@ public class OrderCreatedEvent implements ExportedEvent<String, JsonNode> {
----
The following example illustrates an `OrderService` that emits the `OrderCreatedEvent`:
.OrderService.java
[source,java,indent=0]
----
@ApplicationScoped
@ -121,7 +125,7 @@ The extension works out-of-the-box with a default configuration, but this config
=== Build time configuration options
[cols="65%a,>15%a,>20%",width=100,options="header,footer"]
[cols="65%a,>15%a,>20%",width=100,options="header"]
|=======================
|Configuration property
|Type
@ -173,7 +177,7 @@ When not using the default values, be sure that the SMT configuration matches.
=== Runtime configuration options
[cols="65%a,>15%a,>20%",width=100,options="header,footer"]
[cols="65%a,>15%a,>20%",width=100,options="header"]
|=======================
|Configuration property
|Type

View File

@ -10,7 +10,7 @@ toc::[]
[NOTE]
====
This feature is currently in incubating state, i.e. exact semantics, configuration options etc. may change in future revisions, based on the feedback we receive. Please let us know if you encounter any problems will using these SerDes.
This feature is currently in incubating state, i.e. exact semantics, configuration options etc. may change in future revisions, based on the feedback we receive. Please let us know if you encounter any problems while using these SerDes.
====
Debezium generates data change events in the form of a complex message structure.
@ -93,7 +93,7 @@ The deserializer behaviour is driven by the `from.field` configuration option an
[[configuration_options]]
=== Configuration options
[cols="35%a,10%a,55%a",width=100,options="header,footer",role="table table-bordered table-striped"]
[cols="35%a,10%a,55%a",width=100,options="header"]
|=======================
|Property
|Default

View File

@ -248,7 +248,7 @@ EmbeddedEngine engine = EmbeddedEngine.create()
The following configuration properties are _required_ unless a default value is available (for the sake of text formatting the package names of Java classes are replaced with `<...>`).
[cols="35%a,10%a,55%a",options="header,footer",role="table table-bordered table-striped"]
[cols="35%a,10%a,55%a",options="header"]
|=======================
|Property
|Default

View File

@ -32,7 +32,8 @@ To configure logging, you specify the desired level for each logger and the appe
If you're running Debezium connectors in a Kafka Connect process, then Kafka Connect will use the Log4J configuration file (e.g., `config/connect-log4j.properties`) in the Kafka installation. The following are snippets from this file:
[source]
.log4j.properties
[source,properties]
----
log4j.rootLogger=INFO, stdout <1>
@ -54,7 +55,8 @@ For the most part, the Debezium code sends its log messages to loggers with name
This means that you can easily control all of the log messages for a specific class or for all of the classes within or under a specific package. For example, turning on debug logging for the _entire_ MySQL connector (and the database history implementation used by the connector) might be as simple as adding the following line(s) to your `log4j.properties` file:
[listing,indent=0,options="nowrap"]
.log4j.properties
[listing,properties,indent=0,options="nowrap"]
----
...
log4j.logger.io.debezium.connector.mysql=DEBUG, stdout <1>
@ -72,7 +74,8 @@ Configuring logging is a tradeoff: provide too little and it's not clear what is
You can also configure logging for a specific subset of the classes within the connector, simply by adding more lines like those above, but with more specific logger names. For example, maybe you're not sure why the MySQL connector is skipping some events when it is processing the binlog. Rather than turn on `DEBUG` or `TRACE` logging for the whole connector, you can set the connector's logging to `INFO` and then configure `DEBUG` or `TRACE` on just the class that's reading the binlog. For example:
[listing,indent=0,options="nowrap"]
.log4j.properties
[listing,properties,indent=0,options="nowrap"]
----
...
log4j.logger.io.debezium.connector.mysql=INFO, stdout <1>
@ -99,7 +102,8 @@ Most Debezium connectors use multiple threads to perform different activities, a
You can use these properties within the appender's pattern defined in the `log4j.properties` file. For example, the following is a modification of the `stdout` appender's layout to use these MDC properties:
[listing,indent=0,options="nowrap"]
.log4j.properties
[listing,properties,indent=0,options="nowrap"]
----
...
log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p %X{dbz.connectorType}|%X{dbz.connectorName}|%X{dbz.connectorContext} %m [%c]%n
@ -108,7 +112,7 @@ log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p %X{dbz.connecto
This will produce messages in the log similar to these:
[listing,indent=0,options="nowrap"]
[listing,shell,indent=0,options="nowrap"]
----
...
2017-02-07 20:49:37,692 INFO MySQL|dbserver1|snapshot Starting snapshot for jdbc:mysql://mysql:3306/?useInformationSchema=true&nullCatalogMeansCurrent=false&useSSL=false&useUnicode=true&characterEncoding=UTF-8&characterSetResults=UTF-8&zeroDateTimeBehavior=convertToNull with user 'debezium' [io.debezium.connector.mysql.SnapshotReader]
@ -128,7 +132,8 @@ The containers use a `LOG_LEVEL` environment variable to set the log level for t
If you need more control, create a new image that is based on ours, and in your `Dockerfile` copy your own `log4j.properties` file into the image:
[listing,indent=0,options="nowrap"]
.Dockerfile
[listing,dockerfile,indent=0,options="nowrap"]
----
...
COPY log4j.properties $KAFKA_HOME/config/log4j.properties

View File

@ -23,7 +23,7 @@ JMX can be enabled in Zookeeper, Kafka, and Kafka Connect using their standard i
Zookeeper has built-in support for JMX. When running Zookeeper using a local installation, the `zkServer.sh` script recognizes the following environment variables:
[cols="35%a,10%a,55%a",options="header,footer",role="table table-bordered table-striped"]
[cols="35%a,10%a,55%a",options="header"]
|=======================
|Environment Variable
|Default
@ -51,7 +51,7 @@ Zookeeper has built-in support for JMX. When running Zookeeper using a local ins
When running Kafka using a local installation, the `kafka-server-start.sh` script recognizes the following environment variables:
[cols="35%a,10%a,55%a",options="header,footer",role="table table-bordered table-striped"]
[cols="35%a,10%a,55%a",options="header"]
|=======================
|Environment Variable
|Default
@ -72,7 +72,7 @@ When running Kafka using a local installation, the `kafka-server-start.sh` scrip
When running Kafka using a local installation, the `connect-distributed.sh` script recognizes the following environment variables:
[cols="35%a,10%a,55%a",options="header,footer",role="table table-bordered table-striped"]
[cols="35%a,10%a,55%a",options="header"]
|=======================
|Environment Variable
|Default
@ -96,7 +96,7 @@ Enable JMX for a JVM running in a Docker container requires several additional o
The `debezium/zookeeper` image recognizes the following JMX-related environment variables:
[cols="35%a,10%a,55%a",options="header,footer",role="table table-bordered table-striped"]
[cols="35%a,10%a,55%a",options="header"]
|=======================
|Environment Variable
|Default
@ -125,15 +125,16 @@ The `debezium/zookeeper` image recognizes the following JMX-related environment
The following example Docker command starts a container using the `debezium/zookeeper` image with values for the `JMXPORT` and `JMXHOST` environment variables, and maps the Docker host's port 9010 to the container's JMX port:
```
[source,shell]
----
docker run -it --rm --name zookeeper -p 2181:2181 -p 2888:2888 -p 3888:3888 -p 9010:9010 -e JMXPORT=9010 -e JMXHOST=10.0.1.10 debezium/zookeeper:latest
```
----
=== Kafka in Docker
The `debezium/kafka` image recognizes the following JMX-related environment variables:
[cols="35%a,10%a,55%a",options="header,footer",role="table table-bordered table-striped"]
[cols="35%a,10%a,55%a",options="header"]
|=======================
|Environment Variable
|Default
@ -159,15 +160,16 @@ The `debezium/kafka` image recognizes the following JMX-related environment vari
The following example Docker command starts a container using the `debezium/kafka` image with values for the `JMXPORT` and `HOST_NAME` environment variables, and maps the Docker host's port 9011 to the container's JMX port:
```
[source,shell]
----
docker run -it --rm --name kafka -p 9092:9092 -p 9011:9011 -e JMXPORT=9011 -e JMXHOST=10.0.1.10 --link zookeeper:zookeeper debezium/kafka:latest
```
----
=== Kafka Connect in Docker
The `debezium/connect` image recognizes the following JMX-related environment variables:
[cols="35%a,10%a,55%a",options="header,footer",role="table table-bordered table-striped"]
[cols="35%a,10%a,55%a",options="header"]
|=======================
|Environment Variable
|Default
@ -194,9 +196,10 @@ The following example Docker command start a container using the `debezium/conne
The Docker command to start a container using the `debezium/connect` image defines these variables using Docker's standard `-e` parameter, and maps the JMX port to a port on the Docker host. For example, the following command starts a container with JMX exposed on port 9012:
```
[source,shell]
----
docker run -it --rm --name connect -p 8083:8083 -p 9012:9012 -e JMXPORT=9012 -e JMXHOST=10.0.1.10 -e GROUP_ID=1 -e CONFIG_STORAGE_TOPIC=my_connect_configs -e OFFSET_STORAGE_TOPIC=my_connect_offsets --link zookeeper:zookeeper --link kafka:kafka --link mysql:mysql debezium/connect:latest
```
----
== Prometheus and Grafana

View File

@ -24,7 +24,7 @@ It consists of enterprise grade configuration files and images that bring Kafka
First we install the operators and templates for the Kafka broker and Kafka Connect into our OpenShift project:
[listing,subs="attributes",options="nowrap"]
[source,shell,subs="attributes",options="nowrap"]
----
export STRIMZI_VERSION={strimzi-version}
git clone -b $STRIMZI_VERSION https://github.com/strimzi/strimzi-kafka-operator
@ -37,7 +37,7 @@ oc create -f install/cluster-operator && oc create -f examples/templates/cluster
Next we will deploy a Kafka broker cluster and a Kafka Connect cluster and then create a Kafka Connect image with the Debezium connectors installed:
[listing,subs="attributes",options="nowrap"]
[source,shell,subs="attributes",options="nowrap"]
----
# Deploy an ephemeral single instance Kafka broker
oc process strimzi-ephemeral -p CLUSTER_NAME=broker -p ZOOKEEPER_NODE_COUNT=1 -p KAFKA_NODE_COUNT=1 -p KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 -p KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR=1 | oc apply -f -