This document describes the database setup required for streaming data changes out of https://www.postgresql.org/[PostgreSQL].
This comprises configuration applying to the database itself as well as the installation of the https://github.com/eulerto/wal2json[wal2json] logical decoding output plug-in.
The installation and the tests were performed in the following environment and configuration:
This means that a logical decoding output plug-in is no longer necessary and changes can be emitted directly from the replication stream by the connector.
====
[[logical-decoding-plugin-setup]]
== Logical Decoding Plug-ins
Logical decoding is the process of extracting all persistent changes to a database's tables into a coherent, easy-to-understand format
that can be interpreted without detailed knowledge of the database's internal state.
As of PostgreSQL 9.4, logical decoding is implemented by decoding the contents of the write-ahead log, which describe changes
on a storage level, into an application-specific form such as a stream of tuples or SQL statements.
In the context of logical replication, a slot represents a stream of changes that can be replayed to a client in the order
they were made on the origin server. Each slot streams a sequence of changes from a single database.
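
As a minimal sketch of how a slot exposes a stream of changes, the following statements create a logical replication slot, peek at its pending changes, and drop it again. The slot name `demo_slot` is a hypothetical placeholder. The sketch assumes that `wal_level` is already set to `logical`, and it uses the `test_decoding` example plug-in from the PostgreSQL contrib modules rather than the plug-ins discussed in this document.

[source,sql]
----
-- Create a logical replication slot bound to the current database.
-- 'demo_slot' is a placeholder name; 'test_decoding' is PostgreSQL's example output plug-in.
SELECT * FROM pg_create_logical_replication_slot('demo_slot', 'test_decoding');

-- Peek at pending changes without consuming them
-- (NULL, NULL means no upper bound on LSN or on the number of changes).
SELECT * FROM pg_logical_slot_peek_changes('demo_slot', NULL, NULL);

-- Drop the slot when finished so that it stops retaining WAL.
SELECT pg_drop_replication_slot('demo_slot');
----

Peeking leaves the changes in the slot, so the second statement can be repeated safely while experimenting.
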
The output plug-ins transform the data from the write-ahead log's internal representation into the format the consumer
of a replication slot desires. Plug-ins are written in C, compiled, and installed on the machine which runs the PostgreSQL server,
and they use a number of PostgreSQL specific APIs, as described by the
For simplicity, {prodname} also provides a container image based on a vanilla https://github.com/debezium/docker-images/tree/main/postgres/9.6[PostgreSQL server image]
<1> tells the server to load the `wal2json` logical decoding plug-in at startup (use `decoderbufs` for https://github.com/google/protobuf[protobuf]);
the plug-in names are set in the https://github.com/debezium/postgres-decoderbufs/blob/v{debezium-version}/Makefile[protobuf]
and https://github.com/eulerto/wal2json/blob/master/Makefile[wal2json] Makefiles
<2> tells the server that it should use logical decoding with the write-ahead log
<3> tells the server that it should use a maximum of `4` separate processes for processing WAL changes
<4> tells the server that it should allow a maximum of `4` replication slots to be created for streaming WAL changes
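
One way to confirm that these settings took effect after a server restart is to query them from a SQL session. This is only a sketch: it assumes the callouts refer to the standard `shared_preload_libraries`, `wal_level`, `max_wal_senders`, and `max_replication_slots` parameters, and the expected values in the comments mirror the configuration described above.

[source,sql]
----
SHOW shared_preload_libraries;  -- should include wal2json (or decoderbufs)
SHOW wal_level;                 -- should be logical
SHOW max_wal_senders;           -- 4 in the configuration described above
SHOW max_replication_slots;     -- 4 in the configuration described above
----
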
{prodname} uses PostgreSQL's logical decoding, which relies on replication slots. A replication slot is guaranteed to retain all of the WAL that {prodname} requires, even during {prodname} outages. For this reason, it is important to monitor replication slots closely: if a {prodname} slot stays unused for too long, the retained WAL can consume excessive disk space and lead to other problems such as catalog bloat. For more information, see the https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION-SLOTS[official PostgreSQL documentation on replication slots].
We strongly recommend reading and understanding https://www.postgresql.org/docs/9.6/static/wal-configuration.html[the official documentation] regarding the mechanics and configuration of the PostgreSQL write-ahead log.
====
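
To illustrate the monitoring advice above, the following query is a minimal sketch that lists each replication slot together with an estimate of the WAL it is holding back. It assumes PostgreSQL 10 or later running as a primary server; on 9.x releases the equivalent functions are named `pg_xlog_location_diff` and `pg_current_xlog_location`.

[source,sql]
----
-- List every replication slot and the amount of WAL between its
-- restart position and the server's current write position.
SELECT slot_name,
       plugin,
       active,
       restart_lsn,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
----

A slot whose `active` column is false and whose `retained_wal` keeps growing is a candidate for investigation.
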
[discrete]
[[setting_replication_permissions]]
=== _Setting up replication permissions_
Replication can only be performed by a database user that has appropriate permissions and only for a configured number of hosts.
In order to give a user replication permissions, define a PostgreSQL role that has _at least_ the `REPLICATION` and `LOGIN` permissions.
For example:
[source,sql]
----
CREATE ROLE name REPLICATION LOGIN;
----
[TIP]
====
By default, superusers have both of the above permissions.
====
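
As a quick check, the role's attributes can be read from the `pg_roles` system view. This sketch assumes the placeholder role `name` from the example above; substitute the actual role name.

[source,sql]
----
-- rolreplication and rolcanlogin should both be true for the connector's role;
-- rolsuper shows whether the role is a superuser.
SELECT rolname, rolreplication, rolcanlogin, rolsuper
FROM pg_roles
WHERE rolname = 'name';
----
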
Add the following lines at the end of the `pg_hba.conf` PostgreSQL configuration file to configure
https://www.postgresql.org/docs/9.6/static/auth-pg-hba-conf.html[client authentication] for database replication.
The PostgreSQL server should allow replication to take place between the server machine and the host on which the
Before starting, make sure that you are logged in as a user with database replication permissions, as configured in a xref:{link-postgresql-plugins}#setting_replication_permissions[previous step].
`REPLICA IDENTITY` is a PostgreSQL-specific table-level setting that determines the amount of information that is available
to logical decoding for `UPDATE` and `DELETE` events.
There are 4 possible values for `REPLICA IDENTITY`:
* *DEFAULT* - `UPDATE` and `DELETE` events will only contain the previous values for the primary key columns of a table
* *NOTHING* - `UPDATE` and `DELETE` events will not contain any information about the previous value on any of the table columns
* *FULL* - `UPDATE` and `DELETE` events will contain the previous values of all the table's columns
* *INDEX* `index name` - `UPDATE` and `DELETE` events will contain the previous values of the columns contained in the index definition named `index name`
You can modify and check the `REPLICA IDENTITY` for a table with the following commands: