From 4a100ea7d642b515fc904b385b65549b04110b63 Mon Sep 17 00:00:00 2001
From: Bob Roldan
Date: Wed, 10 Mar 2021 21:37:08 -0500
Subject: [PATCH] DBZ-2122 Add SQL Server topic; edits to Db2 topic

---
 .../modules/ROOT/pages/connectors/db2.adoc | 70 +++++++++++++------
 .../ROOT/pages/connectors/sqlserver.adoc   | 66 +++++++++++++++--
 2 files changed, 110 insertions(+), 26 deletions(-)

diff --git a/documentation/modules/ROOT/pages/connectors/db2.adoc b/documentation/modules/ROOT/pages/connectors/db2.adoc
index c75dedb0c..fd9c61074 100644
--- a/documentation/modules/ROOT/pages/connectors/db2.adoc
+++ b/documentation/modules/ROOT/pages/connectors/db2.adoc
@@ -1296,19 +1296,38 @@ The `scale` schema parameter contains an integer that represents how many digits
 The `connect.decimal.precision` schema parameter contains an integer that represents the precision of the given decimal value.
 |===
 
-// Type: procedure
+// Type: assembly
 // ModuleID: setting-up-db2-to-run-a-debezium-connector
 // Title: Setting up Db2 to run a {prodname} connector
 [[setting-up-db2]]
-== Set up
+== Setting up Db2
 
-A database administrator must put tables into capture mode before you can run a {prodname} Db2 connector to capture changes that are committed to a Db2 database. To put tables into capture mode, {prodname} provides a set of user-defined functions (UDFs) for your convenience. The procedure here shows how to install and run these management UDFs. Alternatively, you can run Db2 control commands to put tables into capture mode.
+For {prodname} to capture change events that are committed to Db2 tables, a Db2 database administrator with the necessary privileges must configure tables in the database for change data capture.
+After you begin to run {prodname}, you can adjust the configuration of the capture agent to optimize performance.
 
-This procedure assumes that you are logged in as the `db2instl` user, which is the default instance and user name when using the Db2 docker container image.
+ifdef::product[]
+
+For details about setting up Db2 for use with the {prodname} connector, see the following sections:
+
+* xref:configuring-db2-tables-for-change-data-capture[]
+* xref:effect-of-db2-capture-agent-configuration-on-server-load-and-latency[]
+* xref:db2-capture-agent-configuration-parameters[]
+
+endif::product[]
+
+// Type: procedure
+// ModuleID: configuring-db2-tables-for-change-data-capture
+// Title: Configuring Db2 tables for change data capture
+=== Putting tables into capture mode
+
+To put tables into capture mode, {prodname} provides a set of user-defined functions (UDFs) for your convenience.
+The procedure here shows how to install and run these management UDFs. Alternatively, you can run Db2 control commands to put tables into capture mode.
+The administrator must then put each table that you want {prodname} to capture into capture mode.
 
 .Prerequisites
 
 * On the machine on which Db2 is running, the content in `debezium-connector-db2/src/test/docker/db2-cdc-docker` is available in the `$HOME/asncdctools/src` directory.
+* You are logged in as the `db2instl` user, which is the default instance and user name when using the Db2 Docker container image.
 
 .Procedure
@@ -1427,14 +1446,14 @@ VALUES ASNCDC.ASNCDCSERVICES('reinit','asncdc');
 
 {link-prefix}:{link-db2-connector}#managing-debezium-db2-connectors[Reference table for {prodname} Db2 management UDFs]
 
-// Type:concept
-// ModuleID: how-the-db2-capture-agent-configuration
-=== How the Db2 capture agent configuration affects latency, server load, and performance
+// Type: concept
+// ModuleID: effect-of-db2-capture-agent-configuration-on-server-load-and-latency
+=== Effect of Db2 capture agent configuration on server load and latency
 
 When a database administrator enables change data capture for a source table, the capture agent begins to run.
 The agent reads new change event records from the transaction log and replicates the event records to a capture table.
 Between the time that a change is committed in the source table, and the time that the change appears in the corresponding change table, there is always a small latency interval.
-This latency interval represents a gap between when changes occur in the source table and when they become available for {prodname} to stream to Kafka.
+This latency interval represents a gap between when changes occur in the source table and when they become available for {prodname} to stream to Apache Kafka.
 
 Ideally, for applications that must respond quickly to changes in data, you want to maintain close synchronization between the source and capture tables.
 You might imagine that running the capture agent to continuously process change events as rapidly as possible might result in increased throughput and reduced latency --
 populating change tables with new event records as soon as possible after the events occur, in near real time.
 However, this is not necessarily the case.
 There is a performance penalty to pay in the pursuit of more immediate synchronization.
 Each time that the change agent queries the database for new event records, it increases the CPU load on the database host.
 The additional load on the server can have a negative effect on overall database performance, and potentially reduce transaction efficiency, especially during times of peak database use.
 
 It's important to monitor database metrics so that you know if the database reaches the point where the server can no longer support the capture agent's level of activity.
-If you notice performance problems, there are Db2 capture agent settings that you can modify to help balance the overall CPU load on the database host with a tolerable degree of latency.
+If you experience performance issues while running the capture agent, adjust capture agent settings to reduce CPU load.
 
-.Capture agent tuning parameters
-On Db2, the `IBMSNAP_CAPPARMS` table contains parameters that control the operations of the capture agent.
-Should you experience performance issues on Db2, you can configure the capture process by adjusting the values for these parameters.
-Specifying the exact values to set for these parameters is beyond the scope of this documentation.
+// Type: reference
+// ModuleID: db2-capture-agent-configuration-parameters
+=== Db2 capture agent configuration parameters
 
-There are multiple capture agent parameters in the `IBMSNAP_CAPPARMS` table.
-The following parameters are the most significant for modifying capture agent behavior for use with the {prodname} Db2 connector.
+On Db2, the `IBMSNAP_CAPPARMS` table contains parameters that control the behavior of the capture agent.
+You can adjust the values of these parameters to tune the capture process so that it reduces CPU load while still maintaining acceptable levels of latency.
 
-`COMMIT_INTERVAL`:: Specifies the number of seconds that the capture agent waits to commit data to the change data tables.
-A lower value results in the change table receiving a greater number of commits in a shorter time period (lower latency).
-Specify a larger commit interval results in batch processing of the replication workload.
+[NOTE]
+====
+Specific guidance about how to configure Db2 capture agent parameters is beyond the scope of this documentation.
+====
 
-`SLEEP_INTERVAL`:: Specifies the number of seconds that the capture agent waits to start a new commit cycle after it reaches the end of the active transaction log.
-Higher values reduce the number of commit cycles
+In the `IBMSNAP_CAPPARMS` table, the following parameters have the greatest effect on reducing CPU load:
+
+`COMMIT_INTERVAL`::
+* Specifies the number of seconds that the capture agent waits to commit data to the change data tables.
+* A higher value reduces the load on the database host and increases latency.
+* The default value is `30`.
+
+`SLEEP_INTERVAL`::
+* Specifies the number of seconds that the capture agent waits to start a new commit cycle after it reaches the end of the active transaction log.
+* A higher value reduces the load on the database host and increases latency.
+* The default value is `5`.
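+For example, to reduce the load on the database host at the cost of some additional latency, you might lengthen both intervals. The following sketch is illustrative only: the values `60` and `10` are hypothetical, and the `ASNCDC` schema name assumes the control tables that the {prodname} Db2 container image creates. Substitute the capture schema and values that are appropriate for your environment:
+
+[source,sql]
+----
+-- Hypothetical values: raise the commit interval from 30s to 60s,
+-- and the sleep interval from 5s to 10s.
+UPDATE ASNCDC.IBMSNAP_CAPPARMS SET COMMIT_INTERVAL = 60;
+UPDATE ASNCDC.IBMSNAP_CAPPARMS SET SLEEP_INTERVAL = 10;
+
+-- Reinitialize the capture agent so that it picks up the new values.
+VALUES ASNCDC.ASNCDCSERVICES('reinit','asncdc');
+----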
 .Additional resources
 
-* For more information about capture agent parameters, see the documentation for your Db2 database.
+* For more information about capture agent parameters, see the Db2 documentation.
 
 // Type: assembly
 // ModuleID: deploying-debezium-db2-connectors
diff --git a/documentation/modules/ROOT/pages/connectors/sqlserver.adoc b/documentation/modules/ROOT/pages/connectors/sqlserver.adoc
index 63fa53412..c8411e168 100644
--- a/documentation/modules/ROOT/pages/connectors/sqlserver.adoc
+++ b/documentation/modules/ROOT/pages/connectors/sqlserver.adoc
@@ -1333,6 +1333,8 @@ For details about setting up SQL Server for use with the {prodname} connector, s
 * xref:enabling-cdc-on-a-sql-server-table[]
 * xref:verifying-debezium-connector-access-to-the-cdc-table[]
 * xref:debezium-sql-server-connector-on-azure[]
+* xref:effect-of-sql-server-capture-job-agent-configuration-on-server-load-and-latency[]
+* xref:sql-server-capture-job-agent-configuration-parameters[]
 
 endif::product[]
 
@@ -1475,19 +1477,73 @@ ifdef::community[]
 === SQL Server Always On
 
 The SQL Server connector can capture changes from an Always On read-only replica.
-A few pre-requisities are necessary to be fulfilled:
+
+.Prerequisites
 
 * Change data capture is configured and enabled on the primary node. SQL Server does not support CDC directly on replicas.
-* The configuration option `database.applicationIntent` must be set to `ReadOnly`.
+* The configuration option `database.applicationIntent` is set to `ReadOnly`. This is required by SQL Server.
-When {prodname} detects this configuration option then it will:
+When {prodname} detects this configuration option, it responds by taking the following actions:
 
-** set `snapshot.isolation.mode` to `snapshot` as this is the only one transaction isolation mode supported by read-only replicas
-** commit the (read-only) transaction in every execution of the streaming query loop, as this is necessary to get the latest view on CDC data
+** Sets `snapshot.isolation.mode` to `snapshot`, which is the only transaction isolation mode supported for read-only replicas.
+** Commits the (read-only) transaction in every execution of the streaming query loop, which is necessary to get the latest view of CDC data.
 
 endif::community[]
 
+// Type: concept
+// ModuleID: effect-of-sql-server-capture-job-agent-configuration-on-server-load-and-latency
+=== Effect of SQL Server capture job agent configuration on server load and latency
+
+When a database administrator enables change data capture for a source table, the capture job agent begins to run.
+The agent reads new change event records from the transaction log and replicates the event records to a change data table.
+Between the time that a change is committed in the source table, and the time that the change appears in the corresponding change table, there is always a small latency interval.
+This latency interval represents a gap between when changes occur in the source table and when they become available for {prodname} to stream to Apache Kafka.
+
+Ideally, for applications that must respond quickly to changes in data, you want to maintain close synchronization between the source and change tables.
+You might imagine that running the capture agent to continuously process change events as rapidly as possible might result in increased throughput and reduced latency --
+populating change tables with new event records as soon as possible after the events occur, in near real time.
+However, this is not necessarily the case.
+There is a performance penalty to pay in the pursuit of more immediate synchronization.
+Each time that the capture job agent queries the database for new event records, it increases the CPU load on the database host.
+The additional load on the server can have a negative effect on overall database performance, and potentially reduce transaction efficiency, especially during times of peak database use.
+
+It's important to monitor database metrics so that you know if the database reaches the point where the server can no longer support the capture agent's level of activity.
+If you notice performance problems, there are SQL Server capture agent settings that you can modify to help balance the overall CPU load on the database host with a tolerable degree of latency.
+
+// Type: reference
+// ModuleID: sql-server-capture-job-agent-configuration-parameters
+=== SQL Server capture job agent configuration parameters
+
+On SQL Server, parameters that control the behavior of the capture job agent are defined in the SQL Server table link:https://docs.microsoft.com/en-us/sql/relational-databases/system-tables/dbo-cdc-jobs-transact-sql?view=latest[`msdb.dbo.cdc_jobs`].
+If you experience performance issues while running the capture job agent, adjust capture job settings to reduce CPU load by running the link:https://docs.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sys-sp-cdc-change-job-transact-sql?view=latest[`sys.sp_cdc_change_job`] stored procedure and supplying new values.
+
+[NOTE]
+====
+Specifying the exact values to set for SQL Server capture job agent parameters is beyond the scope of this documentation.
+====
+
+The following parameters are the most significant for modifying capture agent behavior for use with the {prodname} SQL Server connector:
+
+`pollinginterval`::
+* Specifies the number of seconds that the capture agent waits between log scan cycles.
+* A higher value reduces the load on the database host and increases latency.
+* A value of `0` specifies no wait between scans.
+* The default value is `5`.
+
+`maxtrans`::
+* Specifies the maximum number of transactions to process during each log scan cycle.
+After the capture job processes the specified number of transactions, it pauses for the length of time that the `pollinginterval` specifies before the next scan begins.
+* A lower value reduces the load on the database host and increases latency.
+* The default value is `500`.
+
+`maxscans`::
+* Specifies a limit on the number of scan cycles that the capture job can attempt in capturing the full contents of the database transaction log.
+If the `continuous` parameter is set to `1`, the job pauses for the length of time that the `pollinginterval` specifies before it resumes scanning.
+* A lower value reduces the load on the database host and increases latency.
+* The default value is `10`.
+
+.Additional resources
+
+* For more information about capture agent parameters, see the SQL Server documentation.
 
 // Type: assembly
 // ModuleID: deploying-and-managing-debezium-sql-server-connectors
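+The capture job parameters above can be changed with `sys.sp_cdc_change_job`. For example, to reduce CPU load on the database host, you might lengthen `pollinginterval`. The following sketch is illustrative only; the value `10` is a hypothetical choice, not a recommendation:
+
+[source,sql]
+----
+-- Wait 10 seconds between log scan cycles instead of the default 5.
+EXEC sys.sp_cdc_change_job @job_type = N'capture', @pollinginterval = 10;
+
+-- Stop and restart the capture job so that the new value takes effect.
+EXEC sys.sp_cdc_stop_job @job_type = N'capture';
+EXEC sys.sp_cdc_start_job @job_type = N'capture';
+----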