DBZ-3401 Document new hybrid mining strategy

Chris Cranford 2024-02-07 12:52:19 -05:00 committed by Chris Cranford
parent 648db88868
commit e4e946fdd2

@@ -3623,7 +3623,10 @@ This also enables tracking DDL changes against captured tables, so if the schema
+
`online_catalog`:: Uses the database's current data dictionary to resolve object ids and does not write any extra information to the online redo logs.
This allows LogMiner to mine substantially faster, but at the cost that DDL changes cannot be tracked.
If the schema of the captured tables changes rarely or never, this is the ideal choice. +
+
`hybrid`:: Uses a combination of the database's current data dictionary and the {prodname} in-memory schema model to resolve table and column names seamlessly.
This mode performs at the level of the `online_catalog` LogMiner strategy and offers the schema tracking resilience of the `redo_log_catalog` strategy, without incurring the archive log generation overhead and performance costs of the `redo_log_catalog` strategy.
|[[oracle-property-log-mining-query-filter-mode]]<<oracle-property-log-mining-query-filter-mode, `+log.mining.query.filter.mode+`>>
|`none`
@@ -5061,19 +5064,29 @@ This will cause changes that occurred between the old SCN value and the newly pr
This is not recommended.
*What's the difference between the various mining strategies?*::
The {prodname} Oracle connector provides three options for `log.mining.strategy`.
+
The default is `redo_log_catalog`, which instructs the connector to write the Oracle data dictionary to the redo logs every time a log switch is detected.
This data dictionary is necessary for Oracle LogMiner to track schema changes effectively when parsing the redo and archive logs.
This option generates more archive logs than usual but allows the captured tables to be altered in real time without any impact on capturing data changes.
This option generally requires more Oracle database memory and causes the Oracle LogMiner session and process to take slightly longer to start after each log switch.
+
The second option, `online_catalog`, does not write the data dictionary to the redo logs.
Instead, Oracle LogMiner will always use the online data dictionary that contains the current state of the table's structure.
This also means that if a table's structure changes and no longer matches the online data dictionary, Oracle LogMiner will be unable to resolve table or column names.
This mining strategy option should not be used if the tables being captured are subject to frequent schema changes.
Data changes must be lock-stepped with the schema change: wait until all changes for the table have been captured from the logs, stop the connector, apply the schema change, restart the connector, and then resume data changes on the table.
This option requires less Oracle database memory, and Oracle LogMiner sessions generally start substantially faster because the data dictionary does not need to be loaded or primed by the LogMiner process.
+
The final option, `hybrid`, combines the strengths of the above two strategies with none of their weaknesses.
This strategy pairs the performance of `online_catalog` with the schema tracking resilience of `redo_log_catalog`, while avoiding the overhead and performance costs of higher-than-normal archive log generation.
This mode uses a fallback: if LogMiner fails to reconstruct the SQL for a database change, the {prodname} connector relies on its in-memory schema model to reconstruct the SQL in-flight.
The intent is that this mode will eventually become the default, and likely the only, mode of operation.
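+
As a rough illustration of that fallback, the following Java sketch rewrites a statement whose names LogMiner could not resolve.
It is not the connector's implementation: the `HybridFallbackSketch` class, its two lookup maps, and the sample table are hypothetical, and the sketch assumes that unresolved statements carry placeholders of the form `"UNKNOWN"."OBJ# 12345"` and `"COL 1"`.
+
```java
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HybridFallbackSketch {

    // Hypothetical stand-in for the connector's in-memory schema model:
    // object id -> qualified table name, and object id -> ordered column names.
    static final Map<Long, String> TABLE_NAMES =
            Map.of(12345L, "\"INVENTORY\".\"CUSTOMERS\"");
    static final Map<Long, List<String>> COLUMN_NAMES =
            Map.of(12345L, List.of("ID", "NAME"));

    // Assumed placeholder shapes for names that could not be resolved.
    static final Pattern OBJECT = Pattern.compile("\"UNKNOWN\"\\.\"OBJ# (\\d+)\"");
    static final Pattern COLUMN = Pattern.compile("\"COL (\\d+)\"");

    static String resolve(String sql) {
        Matcher object = OBJECT.matcher(sql);
        if (!object.find()) {
            return sql; // Names were already resolved; nothing to do.
        }
        long objectId = Long.parseLong(object.group(1));
        String table = TABLE_NAMES.get(objectId);
        List<String> columns = COLUMN_NAMES.get(objectId);
        if (table == null || columns == null) {
            return sql; // The schema model does not know this object.
        }
        // Swap the object placeholder for the table name from the model.
        String withTable = object.replaceFirst(Matcher.quoteReplacement(table));
        // Swap each positional column placeholder (1-based) for its name.
        return COLUMN.matcher(withTable).replaceAll(match -> {
            int position = Integer.parseInt(match.group(1));
            if (position < 1 || position > columns.size()) {
                return Matcher.quoteReplacement(match.group()); // Leave as-is.
            }
            return Matcher.quoteReplacement("\"" + columns.get(position - 1) + "\"");
        });
    }
}
```
+
Statements whose names are already resolved pass through `resolve` unchanged, which mirrors the idea that the schema model is only consulted when the online catalog lookup falls short.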
*Are there any limitations with the Hybrid mining strategy with LogMiner?*::
Yes, the Hybrid mode for `log.mining.strategy` is still a work in progress, and therefore does not yet support all data types.
At this time, this mode cannot reconstruct SQL statements that include operations against `CLOB`, `NCLOB`, `BLOB`, `XML`, or `JSON` data types.
In short, if you set `lob.enabled` to `true`, you cannot use the Hybrid strategy; the connector will fail to start because this combination is unsupported.
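+
The unsupported combination can be caught before deployment with a small pre-flight check such as the Java sketch below.
The `HybridStrategyCheck` class is hypothetical and is not the connector's own validation logic; it only inspects the `log.mining.strategy` and `lob.enabled` properties, assuming they default to `redo_log_catalog` and `false` when absent.
+
```java
import java.util.Properties;

final class HybridStrategyCheck {

    // Returns false only for the combination described above:
    // the hybrid mining strategy together with lob.enabled=true.
    static boolean isSupported(Properties config) {
        String strategy = config.getProperty("log.mining.strategy", "redo_log_catalog");
        boolean lobEnabled = Boolean.parseBoolean(config.getProperty("lob.enabled", "false"));
        return !("hybrid".equalsIgnoreCase(strategy) && lobEnabled);
    }
}
```
+
Running a check like this against a connector configuration surfaces the conflict earlier than waiting for the connector to fail at startup.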
*Why does the connector appear to stop capturing changes on AWS?*::
Due to the https://aws.amazon.com/blogs/networking-and-content-delivery/best-practices-for-deploying-gateway-load-balancer[fixed idle timeout of 350 seconds on the AWS Gateway Load Balancer],