From 8bf4a85a8158b3cc21dc0ed60c7cde2fdf37f09c Mon Sep 17 00:00:00 2001
From: Gunnar Morling
Date: Wed, 3 Feb 2021 13:52:28 +0100
Subject: [PATCH] DBZ-2382 Doc update

---
 .../ROOT/pages/connectors/postgresql.adoc | 95 +++++++++++++++++--
 1 file changed, 88 insertions(+), 7 deletions(-)

diff --git a/documentation/modules/ROOT/pages/connectors/postgresql.adoc b/documentation/modules/ROOT/pages/connectors/postgresql.adoc
index 31ad9d35a..f8c25a074 100644
--- a/documentation/modules/ROOT/pages/connectors/postgresql.adoc
+++ b/documentation/modules/ROOT/pages/connectors/postgresql.adoc
@@ -933,7 +933,7 @@ The value of a change event for an update in the sample `customers` table has th
             "connector": "postgresql",
             "name": "PostgreSQL_server",
             "ts_ms": 1559033904863,
-            "snapshot": null,
+            "snapshot": false,
             "db": "postgres",
             "schema": "public",
             "table": "customers",
@@ -970,7 +970,7 @@ a|Mandatory field that describes the source metadata for the event. The `source`
 * Connector type and name
 * Database and table that contains the new row
 * Schema name
-* If the event was part of a snapshot
+* If the event was part of a snapshot (always `false` for _update_ events)
 * ID of the transaction in which the operation was performed
 * Offset of the operation in the database log
 * Timestamp for when the change was made in the database
@@ -1007,7 +1007,7 @@ as a primary key change. For a primary key change, in place of sending an `UPDAT
 [[postgresql-delete-events]]
 === _delete_ events
 
-The value in a _delete_ change event has the same `schema` portion as _create_ and _update_ events for the same table. The `payload` portion in a _delete_ event for the sample `customers` table looks like this:
 
 [source,json,indent=0,subs="+attributes"]
 ----
@@ -1023,7 +1023,7 @@ The value in a _delete_ change event has the same `schema` portion as _create_ a
             "connector": "postgresql",
             "name": "PostgreSQL_server",
             "ts_ms": 1559033904863,
-            "snapshot": null,
+            "snapshot": false,
             "db": "postgres",
             "schema": "public",
             "table": "customers",
@@ -1058,9 +1058,9 @@ a|Mandatory field that describes the source metadata for the event. In a _delete
 
 * {prodname} version
 * Connector type and name
-* Database and table that contains the new row
+* Database and table that contained the deleted row
 * Schema name
-* If the event was part of a snapshot
+* If the event was part of a snapshot (always `false` for _delete_ events)
 * ID of the transaction in which the operation was performed
 * Offset of the operation in the database log
 * Timestamp for when the change was made in the database
@@ -1091,6 +1091,76 @@ PostgreSQL connector events are designed to work with link:{link-kafka-docs}#com
 .Tombstone events
 When a row is deleted, the _delete_ event value still works with log compaction, because Kafka can remove all earlier messages that have that same key. However, for Kafka to remove all messages that have that same key, the message value must be `null`. To make this possible, the PostgreSQL connector follows a _delete_ event with a special _tombstone_ event that has the same key but a `null` value.
 
+// Type: continue
+[[postgresql-truncate-events]]
+=== _truncate_ events
+
+A _truncate_ change event signals that a table has been truncated.
+In this case, the message key is `null` and the message value looks like this:
+
+[source,json,indent=0,subs="+attributes"]
+----
+{
+    "schema": { ... },
+    "payload": {
+        "source": { // <1>
+            "version": "{debezium-version}",
+            "connector": "postgresql",
+            "name": "PostgreSQL_server",
+            "ts_ms": 1559033904863,
+            "snapshot": false,
+            "db": "postgres",
+            "schema": "public",
+            "table": "customers",
+            "txId": 556,
+            "lsn": 46523128,
+            "xmin": null
+        },
+        "op": "t", // <2>
+        "ts_ms": 1559033904961 // <3>
+    }
+}
+----
+
+.Descriptions of _truncate_ event value fields
+[cols="1,2,7",options="header"]
+|===
+|Item |Field name |Description
+
+|1
+|`source`
+a|Mandatory field that describes the source metadata for the event. In a _truncate_ event value, the `source` field structure is the same as for _create_, _update_, and _delete_ events for the same table, and provides this metadata:
+
+* {prodname} version
+* Connector type and name
+* Database and table that was truncated
+* Schema name
+* If the event was part of a snapshot (always `false` for _truncate_ events)
+* ID of the transaction in which the operation was performed
+* Offset of the operation in the database log
+* Timestamp for when the change was made in the database
+
+|2
+|`op`
+a|Mandatory string that describes the type of operation. The `op` field value is `t`, signifying that this table was truncated.
+
+|3
+|`ts_ms`
+a|Optional field that displays the time at which the connector processed the event. The time is based on the system clock in the JVM running the Kafka Connect task. +
+ +
+In the `source` object, `ts_ms` indicates the time that the change was made in the database. By comparing the value for `payload.source.ts_ms` with the value for `payload.ts_ms`, you can determine the lag between the source database update and {prodname}.
+
+|===
+
+If a single `TRUNCATE` statement applies to multiple tables,
+the connector emits one _truncate_ change event record for each truncated table.
+
+Because _truncate_ events represent a change made to an entire table and have no message key,
+there are no ordering guarantees between the _truncate_ events for a table and its other change events (_create_, _update_, and so on),
+unless the topic has a single partition.
+For instance, a consumer might receive an _update_ event only after a _truncate_ event for that table,
+when those events are read from different partitions.
+
 // Type: reference
 // ModuleID: how-debezium-postgresql-connectors-map-data-types
 // Title: How {prodname} PostgreSQL connectors map data types
@@ -2483,7 +2553,18 @@ If `table_a` has a an `id` column, and `regex_1` is `^i` (matches any column tha
  +
 `base64` represents binary data as base64-encoded strings. +
  +
-`hex` represents binary data as hex-encoded (base16) strings. +
+`hex` represents binary data as hex-encoded (base16) strings.
+
+|[[postgresql-property-truncat-handling-mode]]<>
+|`skip`
+|Specifies whether `TRUNCATE` events are propagated (only available when using the `pgoutput` plug-in with PostgreSQL 11 or later): +
+ +
+`skip` causes those events to be omitted (the default). +
+ +
+`include` causes those events to be included. +
++
+See xref:postgresql-truncate-events[] for the structure of _truncate_ events and their ordering semantics.
+
 |===
 
 [id="postgresql-advanced-configuration-properties"]
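
To illustrate how a consumer might tell the new _truncate_ events apart from other records, here is a minimal Python sketch. It assumes the record key and value arrive as raw JSON bytes (or `None`) and that the value carries the `payload` envelope shown in the patch. The `summarize` helper and its return shape are hypothetical and not part of Debezium. The `c`, `u`, and `d` codes are the create, update, and delete operation codes from the connector documentation; only `t` appears in this patch.

```python
import json
from typing import Optional

# "t" is the truncate code introduced by this patch; "c", "u", and "d" are the
# create, update, and delete codes used elsewhere in the connector docs.
OP_NAMES = {"c": "create", "u": "update", "d": "delete", "t": "truncate"}


def summarize(key: Optional[bytes], value: Optional[bytes]) -> dict:
    """Hypothetical helper: describe one change event record."""
    if value is None:
        # Tombstone: same key as the preceding delete event, null value.
        return {"kind": "tombstone"}
    payload = json.loads(value)["payload"]
    source = payload["source"]
    processed_ms = payload.get("ts_ms")  # optional field per the table above
    return {
        "kind": OP_NAMES.get(payload["op"], "unknown"),
        "table": f'{source["schema"]}.{source["table"]}',
        # Truncate events have a null key, so they carry no per-row identity.
        "keyed": key is not None,
        # Lag between the database change and connector processing.
        "lag_ms": None if processed_ms is None else processed_ms - source["ts_ms"],
    }


# Sample value taken from the truncate event in the patch ("schema" elided there).
SAMPLE = json.dumps({
    "payload": {
        "source": {
            "connector": "postgresql",
            "name": "PostgreSQL_server",
            "ts_ms": 1559033904863,
            "snapshot": False,
            "db": "postgres",
            "schema": "public",
            "table": "customers",
            "txId": 556,
            "lsn": 46523128,
            "xmin": None,
        },
        "op": "t",
        "ts_ms": 1559033904961,
    }
}).encode()

print(summarize(None, SAMPLE))
```

Because a _truncate_ record has no key, a consumer reading a multi-partition topic cannot rely on its position relative to row-level events for the same table. A sketch like this only classifies the record and does not restore ordering.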