[id="logical-decoding-output-plugin-installation-for-postgresql"] = Logical Decoding Output Plug-in Installation for PostgreSQL :toc: :toc-placement: macro :linkattrs: :icons: font :source-highlighter: highlight.js toc::[] This document describes the database setup required for streaming data changes out of https://www.postgresql.org/[PostgreSQL]. This comprises configuration applying to the database itself as well as the installation of the https://github.com/debezium/postgres-decoderbufs[decoderbufs] logical decoding output plug-in. The installation and the tests are performed at the following environment/configuration: * https://www.postgresql.org/docs/14/index.html[PostgreSQL (v14.x)] * https://github.com/debezium/postgres-decoderbufs[decoderbufs] * https://www.centos.org/[CentOS Stream 8] [NOTE] ==== As of {prodname} 0.10, the connector supports PostgreSQL 10+ logical replication streaming using _pgoutput_. This means that a logical decoding output plug-in is no longer necessary and changes can be emitted directly from the replication stream by the connector. ==== [[logical-decoding-plugin-setup]] == Logical Decoding Plug-ins Logical decoding is the process of extracting all persistent changes to a database's tables into a coherent, easy to understand format which can be interpreted without detailed knowledge of the database's internal state. As of PostgreSQL 9.4, logical decoding is implemented by decoding the contents of the write-ahead log, which describe changes on a storage level, into an application-specific form such as a stream of tuples or SQL statements. In the context of logical replication, a slot represents a stream of changes that can be replayed to a client in the order they were made on the origin server. Each slot streams a sequence of changes from a single database. The output plug-ins transform the data from the write-ahead log's internal representation into the format the consumer of a replication slot desires. Plug-ins are written in C, compiled, and installed on the machine which runs the PostgreSQL server, and they use a number of PostgreSQL specific APIs, as described by the https://www.postgresql.org/docs/14/logicaldecoding-output-plugin.html[PostgreSQL documentation]. {prodname}’s PostgreSQL connector works with one of {prodname}’s supported logical decoding plug-ins, * https://github.com/debezium/postgres-decoderbufs/blob/main/README.md[decoderbufs] or * https://github.com/postgres/postgres/blob/master/src/backend/replication/pgoutput/pgoutput.c[pgoutput] to encode the changes in either https://github.com/google/protobuf[Protobuf] format or https://www.postgresql.org/docs/14/protocol-logicalrep-message-formats.htmllLogical replication] format. [TIP] ==== For simplicity, {prodname} also provides a container image based on a vanilla https://github.com/debezium/container-images/tree/main/postgres/14[PostgreSQL server image] on top of which it compiles and installs the plug-ins. ==== [WARNING] ==== The {prodname} logical decoding plug-ins have only been installed and tested on _Linux_ machines. For Windows and other platforms it may require different installation steps ==== [discrete] ==== Differences between Plug-ins All up-to-date differences are tracked in a test suite https://github.com/debezium/debezium/blob/main/debezium-connector-postgres/src/test/java/io/debezium/connector/postgresql/DecoderDifferences.java[Java class]. More information about the logical decoding and output plug-ins can be found at: * https://www.postgresql.org/docs/14/logicaldecoding-explanation.html[PostgreSQL logical decoding explanation] * https://www.postgresql.org/docs/14/logicaldecoding-output-plugin.html[PostgreSQL logical decoding output plug-in] * https://wiki.postgresql.org/wiki/Logical_Decoding_Plugins[PostgreSQL logical decoding plug-ins] [[logical-decoding-output-plugin-installation]] === Installation At the current installation example, the https://github.com/debezium/postgres-decoderbufs[decoderbufs] output plug-in for logical decoding is used. The decoderbufs output plug-in produces a Protobuf message per database change. Each message contains new/old tuples for an updated table row.. The plug-in *compilation and installation* is performed by executing the related commands extracted from the https://github.com/debezium/container-images/blob/main/postgres/14/Dockerfile[{prodname} Dockerfile]. Before executing the commands, make sure that the user has the privileges to write the `decoderbufs` library at the PostgreSQL `_lib_` directory (at the test environment, the directory is: `/usr/lib64/pgsql/`). Also note that the installation process requires the PostgreSQL utility https://www.postgresql.org/docs/11/app-pgconfig.html[pg_config]. Verify that the `PATH` environment variable is set so as the utility can be found. If not, update the `PATH` environment variable appropriately. For example at the test environment: .*decoderbufs* installation commands [source,bash] ---- $ git clone https://github.com/debezium/postgres-decoderbufs -b v{debezium-version} --single-branch \ && cd postgres-decoderbufs \ && make && make install \ && cd .. \ && rm -rf postgres-decoderbufs ---- .*decoderbufs* installation output [source,bash] ---- Cloning into 'postgres-decoderbufs'... remote: Enumerating objects: 288, done. remote: Counting objects: 100% (4/4), done. remote: Compressing objects: 100% (4/4), done. remote: Total 288 (delta 0), reused 1 (delta 0), pack-reused 284 Receiving objects: 100% (288/288), 91.62 KiB | 3.66 MiB/s, done. Resolving deltas: 100% (131/131), done. Note: switching to 'c9b00aa8c093fa77e08b256bb09d33069a30db86'. You are in 'detached HEAD' state. You can look around, make experimental changes and commit them, and you can discard any commits you make in this state without impacting any branches by switching back to a branch. If you want to create a new branch to retain commits you create, you may do so (now or later) by using -c with the switch command. Example: git switch -c Or undo this operation with: git switch - Turn off this advice by setting config variable advice.detachedHead to false gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fPIC -std=c11 -I/usr/local/include -I. -I./ -I/usr/include/pgsql/server -I/usr/include/pgsql/internal -D_GNU_SOURCE -I/usr/include/libxml2 -c -o src/decoderbufs.o src/decoderbufs.c gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fPIC -std=c11 -I/usr/local/include -I. -I./ -I/usr/include/pgsql/server -I/usr/include/pgsql/internal -D_GNU_SOURCE -I/usr/include/libxml2 -c -o src/proto/pg_logicaldec.pb-c.o src/proto/pg_logicaldec.pb-c.c gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fPIC -shared -o decoderbufs.so src/decoderbufs.o src/proto/pg_logicaldec.pb-c.o -L/usr/lib64 -Wl,-z,relro -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -Wl,--as-needed -lprotobuf-c /usr/bin/mkdir -p '/usr/lib64/pgsql' /usr/bin/mkdir -p '/usr/share/pgsql/extension' /usr/bin/install -c -m 755 decoderbufs.so '/usr/lib64/pgsql/decoderbufs.so' /usr/bin/install -c -m 644 .//decoderbufs.control '/usr/share/pgsql/extension/' ---- [[fedora-rpm]] === Installation on Fedora 30+ {prodname} provides https://apps.fedoraproject.org/packages/postgres-decoderbufs[RPM package] for Fedora operating system too. The package is updated always after a final {prodname} release is done. To use the RPM in question just issue the standard Fedora installation command: [source,bash] ---- $ sudo dnf -y install postgres-decoderbufs ---- The rest of the configuration is same as described below. [[postgresql-server-configuration]] === PostgreSQL Server Configuration Once the *decoderbufs* plug-in has been installed, the database server should be configured. [discrete] === _Setting up libraries, WAL and replication parameters_ Add the following lines at the end of the `postgresql.conf` PostgreSQL configuration file in order to include the plug-in at the shared libraries and to adjust some https://www.postgresql.org/docs/14/runtime-config-wal.html[WAL] and https://www.postgresql.org/docs/14/runtime-config-replication.html[streaming replication] settings. The configuration is extracted from https://github.com/debezium/container-images/blob/main/postgres/14/postgresql.conf.sample[postgresql.conf.sample]. You may need to modify it, if for example you have additionally installed `shared_preload_libraries`. .*_postgresql.conf_* _, configuration file parameters settings_ [source] ---- ############ REPLICATION ############## # MODULES shared_preload_libraries = 'decoderbufs' //<1> # REPLICATION wal_level = logical //<2> max_wal_senders = 4 //<3> max_replication_slots = 4 //<4> ---- <1> tells the server that it should load at startup the `decoderbufs` (the name of the plug-in is set in https://github.com/debezium/postgres-decoderbufs/blob/main/Makefile[decoderbufs] Makefile) <2> tells the server that it should use logical decoding with the write-ahead log <3> tells the server that it should use a maximum of `4` separate processes for processing WAL changes <4> tells the server that it should allow a maximum of `4` replication slots to be created for streaming WAL changes {prodname} uses PostgreSQL's logical decoding, which uses replication slots. Replication slots are guaranteed to retain all WAL required for {prodname} even during {prodname} outages. It is important for this reason to closely monitor replication slots to avoid too much disk consumption and other conditions that can happen such as catalog bloat if a {prodname} slot stays unused for too long. For more information please see the official Postgres docs on https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION-SLOTS[this subject]. [TIP] ==== We strongly recommend reading and understanding https://www.postgresql.org/docs/14/wal-configuration.html[the official documentation] regarding the mechanics and configuration of the PostgreSQL write-ahead log. ==== [discrete] [[setting_replication_permissions]] === _Setting up replication permissions_ Replication can only be performed by a database user that has appropriate permissions and only for a configured number of hosts. In order to give a user replication permissions, define a PostgreSQL role that has _at least_ the `REPLICATION` and `LOGIN` permissions. For example: [source,sql] ---- CREATE ROLE name REPLICATION LOGIN; ---- [TIP] ==== Superusers have by default both of the above roles. ==== Add the following lines at the end of the `pg_hba.conf` PostgreSQL configuration file, so as to configure the https://www.postgresql.org/docs/14/auth-pg-hba-conf.html[client authentication] for the database replication. The PostgreSQL server should allow replication to take place between the server machine and the host on which the {prodname} PostgreSQL connector is running. Note that the authentication refers to the database superuser `postgres`. You may change this accordingly, if some other user with `REPLICATION` and `LOGIN` permissions has been created. [[pg_hba_conf]] .*_pg_hba.conf_* _, configuration file parameters settings_ [source] ---- ############ REPLICATION ############## local replication postgres trust //<1> host replication postgres 127.0.0.1/32 trust //<2> host replication postgres ::1/128 trust //<3> ---- <1> tells the server to allow replication for `postgres` locally (i.e. on the server machine) <2> tells the server to allow `postgres` on `localhost` to receive replication changes using `IPV4` <3> tells the server to allow `postgres` on `localhost` to receive replication changes using `IPV6` [TIP] ==== See https://www.postgresql.org/docs/14/datatype-net-types.html[the PostgreSQL documentation] for more information on network masks. ====