As a part of this work to handle injection in a cleaner way, this commit
adds two new broad concepts called `BeanRegistry` and `ServiceRegistry`.
A BeanRegistry is a glorified registry of different objects that are not
necessarily services but may be desired by a service. This contract will
allow Debezium to integrate in the future with other CDI providers.
A ServiceRegistry is more of an internal concept, where various systems
can be started based on their dependency order. It provides a universal
way to split larger parts of the code into smaller, focused modules that
can be accessed using the Service Locator pattern.
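A minimal sketch of the Service Locator idea mentioned above. The class
`SimpleServiceRegistry` and its methods are hypothetical names chosen for
illustration; they are not Debezium's actual `ServiceRegistry` contract.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; not Debezium's actual ServiceRegistry API.
final class SimpleServiceRegistry {
    private final Map<Class<?>, Object> services = new ConcurrentHashMap<>();

    // Registers a service instance under its service type.
    <T> void register(Class<T> type, T instance) {
        services.put(type, instance);
    }

    // Looks up a service by type and fails fast when none was registered.
    <T> T getService(Class<T> type) {
        Object service = services.get(type);
        if (service == null) {
            throw new IllegalStateException("No service registered for " + type.getName());
        }
        return type.cast(service);
    }
}
```

Callers ask the registry for a type instead of constructing the dependency
themselves, which is what keeps the modules small and decoupled.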
The snapshot phase was not setting the unavailable value placeholder when the
user had configured LOB as off. This aligns that behavior to be consistent
with the behavior from streaming.
Also makes sure that events are correctly removed in the ISPN event processor after a transaction is abandoned.
Also fixes a scenario with event-number-based threshold abandonment in ISPN, where events coming afterwards would still be processed.
There is a corner case where it's possible the Oracle connector may query
the Oracle metadata tables quicker than the ARC process can generate an
archive log history record in V$ARCHIVED_LOG, and this can lead to a race
condition where we may incorrectly advance the connector forward to start
mining a group of logs when a log sequence gap exists in the log ranges.
For users of the online_catalog strategy, LogMiner skips some checks that
it otherwise performs automatically, and one of them is the check for log
sequence gaps. With this fix, Debezium enforces that check itself, even for
users of the faster online_catalog mode, so that no logs are omitted and
no events are missed.
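A rough sketch of what a log sequence gap check can look like. The class
and method names are hypothetical, and the sketch assumes a single list of
sequence numbers; it is not the connector's actual implementation.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper; assumes all sequences belong to one list of logs.
final class LogSequenceGapCheck {
    // Returns true when the sorted, de-duplicated sequences are not contiguous.
    static boolean hasGap(List<Long> sequences) {
        List<Long> sorted = sequences.stream()
                .distinct()
                .sorted()
                .collect(Collectors.toList());
        for (int i = 1; i < sorted.size(); i++) {
            if (sorted.get(i) - sorted.get(i - 1) > 1) {
                return true;
            }
        }
        return false;
    }
}
```

A caller would wait and re-query the log metadata while `hasGap` returns
true, rather than start mining an incomplete set of logs.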
There was a possible situation where, if a long transaction updated and
inserted into the same table with identical keys in a given sequence, the
commit handler would merge several events for a table without LOB columns.
This resulted in a difference between the events expected in the Kafka
topic and what was seen in LogMiner.
This config will be re-used by possible other implementations of the
DebeziumEngine API in the embedded package. As the DebeziumEngine API
can have completely different implementations and thus also config,
the class is called `EmbeddedEngineConfig` as it's assumed to be used
only by the embedded engine "family" of implementations.
To keep backward compatibility, the config options are extracted into
an interface and `EmbeddedEngine` implements this interface, so these
options can still be used in custom classes without any code changes.
It is recommended by Infinispan that specific calls that return a collection
of elements should be treated as a closable object so that any and all the
potential resources associated with the operation are closed.
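A small sketch of the close-after-use pattern, using a plain
`java.util.stream.Stream` as a stand-in for a collection returned by a
cache. The class name is hypothetical and no Infinispan API is used here.

```java
import java.util.stream.Stream;

// Hypothetical example; a JDK Stream stands in for a cache-backed collection.
final class CloseableCollectionExample {
    // Counts the keys and guarantees the stream's close handlers run.
    static long countKeys(Stream<String> keys) {
        try (Stream<String> stream = keys) {
            return stream.count();
        }
    }
}
```

The try-with-resources block releases whatever the stream holds even when
the traversal throws.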
Add a new internal `log.mining.schema_changes.username.exclude.list` option to allow
users to customize the default behavior of excluding the SYS and SYSTEM usernames from
DDL changes.
In some corner cases, users may have unusually large SQL statements that
need to be buffered due to the number of columns paired with the data in
those columns. Previously we capped this to 4000*10 or 40kb primarily to
address situations with LOB operations that could lead to OOM scenarios.
The new code instead logs a warning when exceeding 100kb and hard faults
only when the connector sees Integer.MAX_VALUE number of SQL lines for a
single SQL buffer.
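A minimal sketch of the warn-then-fail behavior described above. The
`SqlBuffer` class is hypothetical; measuring the 100kb threshold in
characters and warning only once per buffer are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical buffer; not the connector's actual SQL buffering code.
final class SqlBuffer {
    // 100kb from the text above; counting characters is an assumption.
    static final int WARN_THRESHOLD = 100 * 1024;

    private final List<String> lines = new ArrayList<>();
    private long length;
    private boolean warned;

    void append(String sqlLine) {
        // Hard fault only when the line count itself cannot grow any further.
        if (lines.size() == Integer.MAX_VALUE) {
            throw new IllegalStateException("SQL buffer cannot hold more lines");
        }
        lines.add(sqlLine);
        length += sqlLine.length();
        // Assumption: warn once per buffer rather than on every append.
        if (length > WARN_THRESHOLD && !warned) {
            warned = true;
            System.err.println("WARN: SQL buffer exceeds " + WARN_THRESHOLD + " characters");
        }
    }

    boolean hasWarned() {
        return warned;
    }

    String sql() {
        return String.join("", lines);
    }
}
```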
Pending transactions with a START_SCN of 0 are considered transactions
that have started before the oldest available archive log and these
will be ignored as the entire transaction cannot be mined.
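As a sketch, ignoring such transactions amounts to a filter on the start
SCN. The `PendingTransaction` type and `mineable` method below are
hypothetical names used only for illustration.

```java
import java.math.BigInteger;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical types; not the connector's actual transaction model.
final class PendingTransactionFilter {
    static final class PendingTransaction {
        final String id;
        final BigInteger startScn;

        PendingTransaction(String id, BigInteger startScn) {
            this.id = id;
            this.startScn = startScn;
        }
    }

    // Keeps only transactions whose start SCN is non-zero, i.e. fully minable.
    static List<PendingTransaction> mineable(List<PendingTransaction> pending) {
        return pending.stream()
                .filter(tx -> tx.startScn.signum() != 0)
                .collect(Collectors.toList());
    }
}
```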
This returns the age, in milliseconds relative to the poll time, of the starting
system change number of the oldest transaction in the transaction buffer.
The SCN data types were previously exposed as `String` types, which is not
consumable by Grafana and Prometheus. By using `BigInteger`, we can now
make these accessible on dashboards.
When a user supplies a column visibility clause in an ALTER TABLE statement,
there are no "modify_col_properties" clauses present, and this will lead to
the aforementioned exception. The listener should be tolerant of this case
and should not initialize any column editors.
The shouldCaptureChangesForTransactionsAcrossSnapshotBoundaryWithoutReemittingDDLChanges test
expects only the tables created by the test itself to exist, not tables from other tests.
Leftover tables commonly appear when another test fails to clean up after itself.
This fix is to guarantee that the Oracle database state is set properly so that tests from
within this class are executed with the right number of tables expected to exist.
We hypothesize that there could be a situation where we may be mining precisely
around the CURRENT_SCN and this may lead to situations where LGWR may not have
flushed all records for the same SCN before being mined by the connector.
Introduces a new configuration option, `log.mining.restart.connection`,
which closes and re-opens the JDBC connection when an Oracle Log switch
occurs or when the optionally configured log mining session max lifetime
is reached.
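A sketch of setting the new option programmatically. Only the option name
comes from the text above; treating it as a boolean flag enabled with
`true` is an assumption, and the class name is hypothetical.

```java
import java.util.Properties;

// Hypothetical helper that builds a fragment of connector configuration.
final class RestartConnectionConfigExample {
    static Properties connectorProperties() {
        Properties props = new Properties();
        // Assumption: the option is a boolean flag and "true" enables it.
        props.setProperty("log.mining.restart.connection", "true");
        return props;
    }
}
```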
Currently we scan all the tables, which may result in a substantial
delay in the initial snapshot when the database is very large. We need to
filter out tables which we are not interested in.
Add back the table filter when loading the schema of tables. As per the
comment on this block of code, passing all tables and a table filter should
be faster than passing only the list of tables we are interested in.
DBZ-1973 Add more tests for other connectors
DBZ-1973 Add send method with offset parameter
DBZ-1973 Instantiate NotificationService in the task class
DBZ-4027 Move specific sink channel configuration to SinkNotificationChannel
DBZ-4027 Remove not used SPI file
DBZ-1973 Moved SPI file definition to debezium-core
DBZ-1973 Rename KafkaNotificationChannel to more generic SinkNotificationChannel
DBZ-1973 Code refactor
DBZ-1973 Improve configuration property description
DBZ-1973 Improve test
DBZ-1973 Add close method to NotificationChannel
DBZ-1973 Implement KafkaNotificationChannel
DBZ-1973 Add NotificationService and LogNotificationChannel
This change explicitly captures the state of two critical tables when the connector
fails to start a LogMiner session, namely V$LOGMNR_LOGS and V$LOGMNR_PARAMETERS.
While these were previously captured by the LogMinerDatabaseStateWriter, the old
code required the DEBUG log level and that isn't always feasible, so this change
will capture these at the ERROR log level going forward.
DBZ-4027 Add an Incremental snapshot test with kafka signaling
DBZ-4027 Code style
DBZ-4027 Make SignalPayload more generic and extensible
DBZ-4027 Rename DatabaseSignalChannel to SourceSignalChannel
DBZ-4027 Improve logging
DBZ-4027 Moved SPI file definition to debezium-core
DBZ-4027 Move SignalProcessor synchronization point to be processed only when a signal cdc event arrives.
DBZ-4027 Add EventDispatcher constructor without signalProcessor for spanner connector
DBZ-4027 Fix NPE
DBZ-4027 Formatting
DBZ-4027 Correctly manage signal on not supported connector
DBZ-4027 Use the correct MongoDbOffset
DBZ-4027 Correctly initialize offset for Oracle and SqlServer connectors
DBZ-4027 Register SPI implementations
DBZ-4027 Improve SignalProcessor instantiation
DBZ-4027 Pass source info in case of SchemaChanges action
DBZ-4027 Manage close event in a synchronous way
DBZ-4027 Correctly init offset context also in case of snapshot mode 'never'
DBZ-4027 Fix MySqlMetricsIT test
DBZ-4027 Move KafkaSignalChannel to core
DBZ-4027 Set pass offset context after initial snapshot to SignalProcessor
DBZ-4027 Pass OffsetContext to signal processor
DBZ-4027 Pass CommonConnectorConfig to SignalChannelReader init method
DBZ-4027 Move Incremental snapshot window actions to dedicated package
DBZ-4027 Align SignalsIT test with new code
DBZ-4027 Fix SignalsIT test
DBZ-4027 Fix SignalProcessor scheduling
DBZ-4027 Moved DatabaseSignalChannel and SignalChannelReader to dedicated package
DBZ-4027 Start SignalProcessor from ChangeEventSourceCoordinator
DBZ-4027 Create SignalProcessor and renamed Signal to DatabaseSignalChannel
DBZ-4027 Initial refactoring of signal feature
There is the potential when using multiple connector deployments on Oracle that
there could be some level of lock contention with all connectors using the same
table. This allows users to configure the flush table name themselves, reducing
the lock contention across multiple connector deployments.
The default Debezium Oracle images are pre-configured with archive log mode
enabled and this isn't something we can simply turn off and back on in the
test suite proper. This test must be invoked manually and separately, so it
is disabled by default in the overall test suite execution.
This avoids a scenario where we want to increment the batch size; however, the
current batch size is equal to the max and therefore triggers a decrement and
the next iteration increments. The new behavior will be that we don't trigger
this ping pong. With this change, we can track specifically when we reach the
max batch size and only log the warning once. If the batch size drops, and it
later increments, the warning will be logged again but this should be expected.
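A sketch of the behavior described above, assuming a fixed step size. The
`BatchSizeTracker` class and its fields are hypothetical and simplified;
they are not the connector's actual batch size logic.

```java
// Hypothetical tracker; illustrates the no-ping-pong and warn-once behavior.
final class BatchSizeTracker {
    private final int min;
    private final int max;
    private final int step;
    private int current;
    private boolean maxWarningLogged;
    private int warningCount;

    BatchSizeTracker(int min, int max, int step, int initial) {
        this.min = min;
        this.max = max;
        this.step = step;
        this.current = initial;
    }

    void increment() {
        // Already at the maximum: do nothing instead of triggering a decrement.
        if (current >= max) {
            return;
        }
        current = Math.min(max, current + step);
        if (current == max && !maxWarningLogged) {
            maxWarningLogged = true;
            warningCount++;
            System.err.println("WARN: batch size reached the maximum of " + max);
        }
    }

    void decrement() {
        current = Math.max(min, current - step);
        // Dropping below the maximum re-arms the warning for the next time.
        if (current < max) {
            maxWarningLogged = false;
        }
    }

    int current() {
        return current;
    }

    int warningCount() {
        return warningCount;
    }
}
```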
Using the current batch size for comparison is wrong when currentScn equals topScn, as described in DBZ-6155.
The other case (when topScn is behind currentScn) can eventually lead to the same situation,
as topScn would fall further and further behind currentScn because currentScn - topScn would be compared with an ever larger number (the current batch size).
Comparing with logMiningBatchSizeMin could result in a very small window, which would mean sending many small queries instead of several bigger ones during mining.
Therefore, revert to doing the adjustments based on the defaultBatchSize.
The "fetch-state" attribute was deprecated and is no longer a valid option
with Infinispan 14, which causes the tests to fail to execute. Additionally,
the "segmented" attribute must be set to true going forward as the file
store implementation no longer supports non-segmented configurations.
We load all the schemas of the captured tables when the connector
starts. If we process a record from a table whose schema is not
available, this means we have some bug in the initial schema loading.
Don't fail in such a case, but log a warning about it.
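A minimal sketch of the warn-instead-of-fail lookup. The `SchemaLookup`
class and the map of schemas keyed by table id are hypothetical.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical lookup; not the connector's actual schema handling.
final class SchemaLookup {
    // Returns the schema when known; otherwise warns and returns empty.
    static <S> Optional<S> schemaFor(Map<String, S> schemas, String tableId) {
        S schema = schemas.get(tableId);
        if (schema == null) {
            System.err.println("WARN: no schema available for table " + tableId);
            return Optional.empty();
        }
        return Optional.of(schema);
    }
}
```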
When taking a snapshot, the Oracle connector was converting the TIMESTAMP
WITH TIME ZONE value to GMT and per the documentation, the value should
be emitted in the time zone of the data.
The snapshot emitted value in GMT is temporally accurate, so there is no
data inconsistency, but the emitted format itself was inconsistent when
looking at how the column data was emitted during a snapshot versus in a
streaming event.
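The difference can be illustrated with plain `java.time`. The class and
method names are hypothetical; the example only shows that both renderings
describe the same instant while the emitted text differs.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical example; not the connector's actual value converter.
final class TimestampTzExample {
    // Renders the value in the time zone of the data.
    static String inOriginalZone(OffsetDateTime value) {
        return value.format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
    }

    // Renders the same instant shifted to GMT.
    static String inGmt(OffsetDateTime value) {
        return value.withOffsetSameInstant(ZoneOffset.UTC)
                .format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
    }
}
```

For `2023-03-01T10:15:30+02:00`, the first method keeps the `+02:00`
offset while the second yields `2023-03-01T08:15:30Z`.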
DBZ-5648 introduced a regression where transaction start, commit, and rollback
events were only being read from within the scope of the configured PDB from
which the connector was capturing changes, instead of from the entire Oracle database.
This can lead to situations where the offsets may not be advanced as quickly
in a low traffic PDB environment, potentially causing stale offsets.