Commit Graph

677 Commits

Author SHA1 Message Date
Debezium Builder
ea182d83f3 [maven-release-plugin] prepare for next development iteration 2024-04-02 07:38:53 +00:00
Debezium Builder
7dcd94d19e [maven-release-plugin] prepare release v2.6.0.Final 2024-04-02 07:38:53 +00:00
Vojtech Juranek
d88fd9e2e7 DBZ-7710 Remove unneeded copying from RecordProcessors 2024-03-27 14:05:11 +01:00
Vojtech Juranek
6b35efc00a DBZ-7661 Don't propagate cancellation exception when polling is stopped
`CancellationException` should be thrown mostly in tests, when we want
to stop as soon as possible and may not wait for the polling task to finish.
2024-03-25 17:48:31 +01:00
Vojtech Juranek
8e0c6ad88e DBZ-7661 Close properly offset backing store 2024-03-25 17:48:31 +01:00
Vojtech Juranek
16a089abb6 DBZ-7661 Always shut down record processing thread pool 2024-03-25 17:48:31 +01:00
Debezium Builder
4df18d9f43 [maven-release-plugin] prepare for next development iteration 2024-03-25 09:57:05 +00:00
Debezium Builder
9656da1fad [maven-release-plugin] prepare release v2.6.0.CR1 2024-03-25 09:57:04 +00:00
Chris Cranford
200f9ed28e DBZ-7596 Improved tests for reselect post processor 2024-03-18 13:44:14 +01:00
Chris Cranford
0e267d8ef6 DBZ-7596 Support reselection of PostgreSQL hstore values 2024-03-18 13:44:14 +01:00
Andreas Martens
adabb899f8 DBZ-7614: expand scope of catch during Engine validation 2024-03-11 15:41:30 +01:00
Andreas Martens
c74c6d6188 DBZ-7614: indent changes in EmbeddedEngine 2024-03-11 15:41:30 +01:00
Andreas Martens
85aea68c9a DBZ-7614: catch exception during validation 2024-03-11 15:41:30 +01:00
Debezium Builder
2fb8fc3004 [maven-release-plugin] prepare for next development iteration 2024-03-06 07:47:33 +00:00
Debezium Builder
cd46b2b998 [maven-release-plugin] prepare release v2.6.0.Beta1 2024-03-06 07:47:33 +00:00
mfvitale
9ad4273791 DBZ-7303 Align snapshot modes for SqlServer connector 2024-03-04 11:17:06 +01:00
mfvitale
211675a355 DBZ-7461 Rename shouldSnapshot to shouldSnapshotData 2024-03-01 14:12:31 +01:00
akula
cd4c6958bd DBZ-7512 Support arbitrary payloads with outbox event router on
debezium server

1. Support for string and binary serialization formats on debezium api.
2. Allow configuring separate key and value formats on embedded engine.

This change fixes the following issue using outbox event router on
embedded engine:

Outbox event router supports arbitrary payload formats with
BinaryDataConverter as the value.converter which passes payload
transparently. However, this is currently not supported with the
embedded engine which handles message conversion using value.format to
specify the format.

In addition, when we want to pass payload transparently, it makes
sense to also pass aggregateid i.e. the event key transparently. The
default outbox table configuration specifies aggregateid as a
varchar which is also not supported by embedded engine.
2024-03-01 08:23:47 +01:00
Vojtech Juranek
6cc68fdfe3 DBZ-7568 Use default engine wait time in all async engine latches 2024-02-29 13:44:54 +01:00
Vojtech Juranek
6928cd775c DBZ-7568 Switch waitTimeForEngine() to seconds and increase wait default time 2024-02-29 13:44:54 +01:00
James Johnston
f632fa081e DBZ-5071 Correctly handle NULL values in incremental snapshots
It turns out that the existing code for chunking a table when taking
an incremental snapshot was buggy and did not correctly handle NULL
values when building the chunk query.  An example of such a situation
would be when the user has specified "message.key.columns" to reference
a column that is part of a PostgreSQL UNIQUE INDEX that was created with
the NULLS NOT DISTINCT option.

This commit updates the new AbstractChunkQueryBuilder so that it checks
whether a key column is optional.  If it is, then it will appropriately
consider NULL values when generating a chunk query, using
"IS [NOT] NULL" clauses.

One complication is that different database engines have different
sorting behavior of ORDER BY.  It is apparently not well-defined by the
SQL standard.  Some databases consider NULL values to be higher than any
non-NULL values, and others consider them to be lower.

To handle this situation, a new nullsSortLast() function is added to the
JdbcConnection class.  By default, it returns an empty value, indicating
that the behavior of the database engine is unknown.  When an optional
field is encountered by AbstractChunkQueryBuilder in this situation, we
throw an error because we don't actually know how to correctly chunk the
query: there's no safe assumption that can be made here.

Derived JdbcConnection classes can then override the nullsSortLast
function, and return a value indicating the actual behavior of that
database engine.  When this is done, the AbstractChunkQueryBuilder then
knows how to correctly build a chunk query that can handle NULL values.

To help test this, new tests have been added to
AbstractIncrementalSnapshotTest.  First, the existing insertsWithoutPks
test has been moved and deduplicated from MySQL and PostgreSQL so that
the test case can be reused on other engines.  Second, a new
insertsWithoutPksAndNull test is run, which inserts data with NULL
values in the message key columns.  To demonstrate that chunk queries
are being correctly generated for practically every case, the
INCREMENTAL_SNAPSHOT_CHUNK_SIZE is set to 1 so that NULL values are not
returned in the middle of a chunk, which can cause us to skip testing
the code we need to test.
2024-02-29 13:36:26 +01:00
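The NULL handling described in DBZ-5071 above can be illustrated with a minimal sketch for a single nullable key column. This is not Debezium's actual `AbstractChunkQueryBuilder` code: the class name, the method name, and the use of `Optional<Boolean>` for the result of `nullsSortLast()` are assumptions made for illustration only. It builds the predicate selecting rows that sort after the last key value of the previous chunk, and refuses to guess when the NULL ordering is unknown.

```java
import java.util.Optional;

// Hypothetical sketch, not the Debezium implementation.
class NullAwareChunkPredicate {

    /**
     * Builds a predicate for rows that sort strictly after the last seen key
     * value when ordering by a single nullable column in ascending order.
     *
     * @param column          key column name
     * @param lastValueIsNull whether the last key of the previous chunk was NULL
     * @param nullsSortLast   empty if the database's NULL ordering is unknown
     */
    static String rowsAfter(String column, boolean lastValueIsNull, Optional<Boolean> nullsSortLast) {
        // Without knowing where NULLs sort there is no safe chunk query.
        boolean nullsLast = nullsSortLast.orElseThrow(
                () -> new IllegalStateException("NULL sort order is unknown for this database"));

        if (nullsLast) {
            // NULLs are the highest values: nothing sorts after a NULL,
            // and NULL rows still follow every non-NULL value.
            return lastValueIsNull
                    ? "1 = 0"
                    : "(" + column + " > ? OR " + column + " IS NULL)";
        }
        // NULLs are the lowest values: every non-NULL row follows a NULL,
        // and a plain comparison already excludes NULL rows.
        return lastValueIsNull
                ? column + " IS NOT NULL"
                : column + " > ?";
    }
}
```

For example, `NullAwareChunkPredicate.rowsAfter("k", false, Optional.of(true))` yields `(k > ? OR k IS NULL)`, while passing `Optional.empty()` throws, mirroring the "no safe assumption" behavior described in the commit message.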
Xianming Zhou
86c1dac16a DBZ-7517 Remove the unused 'connector' parameter in the createSourceTask method in EmbeddedEngine.java 2024-02-22 09:07:55 +01:00
Jiri Pechanec
df18e00173 DBZ-7535 Use error level for error message 2024-02-22 08:31:58 +01:00
Vojtech Juranek
7edefef14c DBZ-7535 Ensure at least one task starts and one fails
Add also `INFO` log with number of failed tasks.
2024-02-22 08:31:58 +01:00
Chris Cranford
68f31f7662 [ci] Log records found when expecting no records 2024-02-21 09:00:45 -05:00
Chris Cranford
93cf3b06bf DBZ-7516 Correctly enable table CDC to avoid failures 2024-02-20 10:59:39 +01:00
Vojtech Juranek
71b04b351e DBZ-7495 Remove low task stop timeout in testsuite
This leads to random test failures and moreover it eventually overrides
`task.management.timeout.ms` configured in concrete tests.
2024-02-19 08:45:33 +01:00
Vojtech Juranek
ae53895cd8 DBZ-7495 Define constant for executor shutdown timeout
Unify executor shutdown timeout for executor services in the code base.
2024-02-19 08:45:33 +01:00
Vojtech Juranek
6c7abe7317 DBZ-7496 Add info how long it takes to stop the task
Mostly to stabilize the testsuite and find the right value for the
default task start/stop timeout. This info may be inaccurate for
multiple tasks as we eventually call `ConnectorCallback::taskStopped`
callback and this time is added to the next task stop time.
2024-02-19 08:45:33 +01:00
Vojtech Juranek
46fa5e79b9 DBZ-7496 Introduce configurable async engine timeout in tests
The default async engine timeout to start and stop is 1 second, but it's
configurable via `debezium.test.engine.waittime` system property.
2024-02-19 08:45:33 +01:00
Vojtech Juranek
ecc4c096ab DBZ-7496 Refactor run method to keep it short 2024-02-19 08:45:33 +01:00
Vojtech Juranek
a2c249ae33 DBZ-7496 Make sure completion callback is called after connector shutdown
Engine is typically run in a different thread than the one from which the
`close()` method is called. During the call of the `close()` method, we stop
task polling and the `run()` method may move to the `finally` block, calling
the completion callback before we return from the `close()` method and thus
e.g. even before the connector is stopped.

Make sure engine state is moved to `STOPPED` and completion callback is
called after engine is really stopped and `close()` method has finished.
2024-02-19 08:45:33 +01:00
Vojtech Juranek
2c61cc7293 DBZ-7496 Fix await conditions, add logging 2024-02-19 08:45:33 +01:00
Jiri Pechanec
7cc8459cd5 DBZ-7488 Keep assertion and switch expected result 2024-02-16 12:33:33 +01:00
Chris Cranford
09e1bf1df0 DBZ-7488 Skip re-selection on r (read) events 2024-02-16 12:33:33 +01:00
Debezium Builder
10e327602c [maven-release-plugin] prepare for next development iteration 2024-02-13 09:20:04 +00:00
Debezium Builder
0c5b05738c [maven-release-plugin] prepare release v2.6.0.Alpha2 2024-02-13 09:20:04 +00:00
Vojtech Juranek
7789d995e5 DBZ-7024 Add possibility to specify engine builder factory
Also add converting builder factory for async engine into SPI service.
2024-02-12 13:43:21 +01:00
Vojtech Juranek
dbdb052535 DBZ-7024 Add converting builder for async engine 2024-02-12 13:43:21 +01:00
Vojtech Juranek
004ebeff16 DBZ-7024 Move creation of converters into dedicated class 2024-02-12 13:43:21 +01:00
Vojtech Juranek
b74a0eb2c2 DBZ-7024 Move RecordProcessors into separate classes 2024-02-12 13:43:21 +01:00
Vojtech Juranek
eef8ee4cea DBZ-7024 Move async engine into separate package 2024-02-12 13:43:21 +01:00
Vojtech Juranek
425407331c DBZ-7024 Add TODO item for improving ConnectorCallback API 2024-02-12 13:43:21 +01:00
Vojtech Juranek
3edc61e443 DBZ-7024 Improve processor instantiation 2024-02-12 13:43:21 +01:00
Vojtech Juranek
cdf5e0255a DBZ-7024 Improve log level and log messages 2024-02-12 13:43:21 +01:00
Vojtech Juranek
a04dc84b3e DBZ-7024 Embed state comparisons into State enum methods
It's safer to have the comparisons directly in the enum, and it also makes
it obvious that the ordering of the enum is important.
2024-02-12 13:43:21 +01:00
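The idea behind the commit above, keeping order-dependent comparisons inside the enum itself, can be sketched as follows. The state names and method names are hypothetical and chosen only for illustration; they are not claimed to match the engine's real `State` enum.

```java
// Hypothetical sketch, not the Debezium implementation.
class EngineStateSketch {

    // Declaration order encodes lifecycle order, so constants must not be reordered.
    enum State {
        CREATING, STARTING, RUNNING, STOPPING, STOPPED;

        // Enum.compareTo compares ordinals, i.e. declaration order.
        boolean isAtLeast(State other) {
            return compareTo(other) >= 0;
        }

        boolean isBefore(State other) {
            return compareTo(other) < 0;
        }
    }
}
```

Callers then write `state.isAtLeast(State.RUNNING)` instead of comparing ordinals by hand, which keeps the dependency on declaration order in one visible place.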
Vojtech Juranek
5c21d19815 DBZ-7024 Use enum for record processing order option 2024-02-12 13:43:21 +01:00
Vojtech Juranek
ee1f33fe33 DBZ-7024 Limit size of records processing thread pool
If the number of threads is provided as a number, limit it to 16 threads
to avoid possible context-switching overhead on beefy machines,
where the default value using all available cores may result in many
threads that would be waiting most of the time anyway, as such a machine
probably runs many other tasks, not only Debezium.

If the user really wants to use all available cores, it can be specified
using `AVAILABLE-CORES` placeholder.
2024-02-12 13:43:21 +01:00
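The rule described in the commit above (cap a numeric thread count at 16, but honor the `AVAILABLE-CORES` placeholder) might look roughly like the sketch below. The class and method names are hypothetical, and the number of cores is passed in as a parameter to keep the example deterministic; the actual engine code may resolve the option differently.

```java
// Hypothetical sketch, not the Debezium implementation.
class ProcessingThreadsSketch {

    static final int MAX_THREADS = 16;
    static final String AVAILABLE_CORES = "AVAILABLE-CORES";

    /**
     * Resolves the size of the record processing thread pool.
     *
     * @param configured     either a number or the AVAILABLE-CORES placeholder
     * @param availableCores number of cores, e.g. Runtime.getRuntime().availableProcessors()
     */
    static int resolve(String configured, int availableCores) {
        String value = configured.trim();
        if (AVAILABLE_CORES.equals(value)) {
            // The user explicitly asked for all cores, so no cap is applied.
            return availableCores;
        }
        // A plain number is capped to limit context-switching overhead.
        return Math.min(Integer.parseInt(value), MAX_THREADS);
    }
}
```

A typical call would be `ProcessingThreadsSketch.resolve("32", Runtime.getRuntime().availableProcessors())`, which is capped at 16 regardless of the machine size.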
Vojtech Juranek
e2d2cff7fd DBZ-7024 Interrupt polling if needed
Some polling tasks may be stuck and we need to interrupt polling during
the shutdown so that we don't have to wait for TASK_MANAGEMENT_TIMEOUT_MS
to time out.

Also, when we start to interrupt polling, we have to remove the interruption
of the main thread in the `catch` part. It was a bug anyway, as it
interrupted the main thread, which we definitely don't want to happen in
any case.
2024-02-12 13:43:21 +01:00
Vojtech Juranek
40131c0531 DBZ-7024 Increase task management timeout to 2min
Increase task management timeout to two minutes and make this option
internal. This timeout will hopefully be sufficient for most
deployments. If not, we will increase the timeout or make this option
public.
2024-02-12 13:43:21 +01:00
Vojtech Juranek
fc7381ad91 DBZ-7024 Improve javadoc and comments 2024-02-12 13:43:21 +01:00
Vojtech Juranek
8bb1a122b1 DBZ-7024 Add missing condition 2024-02-12 13:43:21 +01:00
Vojtech Juranek
0f7d3100b4 DBZ-7024 Add option for creating default ChangeConsumer
This option effectively allows the user to request serial processing of
the records by the provided Consumer.
2024-02-12 13:43:21 +01:00
Vojtech Juranek
cc5f7aedd1 DBZ-7024 Don't provide default ChangeConsumer
To allow user to use different processors, don't provide the default
ChangeHandler.
2024-02-12 13:43:21 +01:00
Vojtech Juranek
de2e4aba9f DBZ-7024 Add debug logging when selecting the processor 2024-02-12 13:43:21 +01:00
Vojtech Juranek
001cb2a640 DBZ-7024 Improve javadocs and comments, fix typos 2024-02-12 13:43:21 +01:00
Vojtech Juranek
4689db90c0 DBZ-7024 Switch abstract embedded tests to async engine 2024-02-12 13:43:21 +01:00
Vojtech Juranek
dfdeab7ab8 DBZ-7024 Add method to await engine shutdown 2024-02-12 13:43:21 +01:00
Vojtech Juranek
3ec22951ce DBZ-7024 Create testing engine and base class for async engine tests 2024-02-12 13:43:21 +01:00
Vojtech Juranek
293b84645d DBZ-7024 Make default RecordCommitter thread unsafe
The default implementation of `RecordCommitter`, the `SourceRecordCommitter`,
is always created for each task and within a given task is called
sequentially, always in the same thread. There's no need to acquire locks
for each method call.

Make `SourceRecordCommitter` thread unsafe.
2024-02-12 13:43:21 +01:00
Vojtech Juranek
998f00f811 DBZ-7024 Create initial implementation of async embedded engine
Implementation is based on the proposed Debezium design document for
asynchronous embedded engine, which is currently still WIP:
https://github.com/debezium/debezium-design-documents/pull/8
2024-02-12 13:43:21 +01:00
Vojtech Juranek
69bbed1fa3 DBZ-7024 Allow to override type into AbstractConnectorTest 2024-02-12 13:43:21 +01:00
Vojtech Juranek
1eec31b3f3 DBZ-7024 Allow to specify multiple records when error should be thrown during processing 2024-02-12 13:43:21 +01:00
Vojtech Juranek
542b0fec7f DBZ-7024 Introduce retryable callable 2024-02-12 13:43:21 +01:00
Vojtech Juranek
d7b7768071 DBZ-7024 Add more testing connectors
Add a connector which runs multiple tasks and a connector in which some of
the tasks fail.
2024-02-12 13:43:21 +01:00
Vojtech Juranek
7eaf0fc288 DBZ-7024 Move reusable testing functions for DebeziumEngine into common util class 2024-02-12 13:43:21 +01:00
Vojtech Juranek
b8e16ee89f DBZ-7024 Move reusable interface implementations for DebeziumEngine into common class 2024-02-12 13:43:21 +01:00
Vojtech Juranek
ba35b395c5 DBZ-7024 Move required setup into EmbeddedWorkerConfig
Move validators required by `WorkerConfig` into `EmbeddedWorkerConfig`
so we have all Kafka related stuff in the same class.
2024-02-12 13:43:21 +01:00
Vojtech Juranek
4452e3d095 DBZ-7024 Move EmbeddedConfig into separate class
`EmbeddedConfig` needs to be shared with other implementations of
`DebeziumEngine` as long as Debezium embedded depends on the Kafka
model.
2024-02-12 13:43:21 +01:00
Chris Cranford
a597a82c19 DBZ-7439 Remove unnecessary log/output entries 2024-02-07 15:00:27 +01:00
Chris Cranford
de9364bb4d DBZ-7439 Fix test compatibility 2024-02-07 15:00:27 +01:00
Debezium Builder
65d63ed42d [maven-release-plugin] prepare for next development iteration 2024-01-21 10:12:45 +00:00
Debezium Builder
485fa82a8f [maven-release-plugin] prepare release v2.6.0.Alpha1 2024-01-21 10:12:44 +00:00
Artem Shubovych
27f42d101a DBZ-7342 Replace temporary variable with immediate and early return 2024-01-17 09:50:54 +01:00
Artem Shubovych
d9de6ceba2 DBZ-7342 Respect the max error retries setting 2024-01-17 09:50:52 +01:00
Roman Kudryashov
0c80f1f38d DBZ-7284 Provide config option to customize CloudEvents.data schema name 2024-01-11 13:20:35 +01:00
Jakub Cechacek
aa0c53ec2a DBZ-7260 Offset consolidation test coverage 2024-01-11 09:58:24 +01:00
mfvitale
47cbdee526 DBZ-7311 Allow executing a blocking snapshot even if snapshot.mode=never 2024-01-08 12:52:03 +01:00
Jiri Pechanec
7a80c9dae8 DBZ-7098 Converting engine should honor unsupported tombstone flag 2024-01-05 08:25:23 +01:00
Debezium Builder
3853d20f44 [maven-release-plugin] prepare for next development iteration 2023-12-21 06:52:01 +00:00
Debezium Builder
5d35e9caaa [maven-release-plugin] prepare release v2.5.0.Final 2023-12-21 06:52:01 +00:00
Roman Kudryashov
516aa87fad DBZ-7235 Add option to customize CloudEvents schema name 2023-12-20 06:53:37 +01:00
Debezium Builder
2c1def7241 [maven-release-plugin] prepare for next development iteration 2023-12-14 09:43:13 +00:00
Debezium Builder
ef8260f802 [maven-release-plugin] prepare release v2.5.0.CR1 2023-12-14 09:43:12 +00:00
Chris Cranford
2b02b3982e DBZ-4321 Rework configuration options 2023-12-13 11:27:40 -05:00
Chris Cranford
59027ed5ed DBZ-4321 New PostProcessor contract and Column Reselection 2023-12-13 11:27:40 -05:00
mfvitale
5ed16284f4 DBZ-6834 Provide INSERT/DELETE semantics for MongoDb incremental snapshot watermarking 2023-12-06 14:10:26 +01:00
mfvitale
4fedfbba03 DBZ-6834 Provide INSERT/DELETE semantics for incremental snapshot watermarking 2023-12-06 14:10:26 +01:00
Debezium Builder
0fd1c0dc9a [maven-release-plugin] prepare for next development iteration 2023-12-04 13:55:35 +00:00
Debezium Builder
3e2d75f0da [maven-release-plugin] prepare release v2.5.0.Beta1 2023-12-04 13:55:35 +00:00
“vsantonastaso”
8c1c369449 DBZ-6878 add table specific notification in initial snapshot 2023-11-29 08:32:21 +01:00
Sebastiaan Knijnenburg
036eda0c64 DBZ-6723 Expose partition number in ChangeEvent interface
Based on feedback in https://github.com/debezium/debezium-server/pull/33 this
commit adds the partition() method to the ChangeEvent interface and implements
it in the EmbeddedEngineChangeEvent. This allows reading the assigned partition
for an event in downstream processors, for example in custom Sinks that need
the assigned partition for routing purposes.

Link:  https://issues.redhat.com/browse/DBZ-6723
2023-11-28 09:38:18 +01:00
Roman Kudryashov
1992c1e7e4 DBZ-7159 Fail fast during deserialization if a value is not a CloudEvent 2023-11-23 14:19:37 +01:00
Vojtech Juranek
60939c8965 DBZ-7099 Provide default value for PeriodicCommitOffsetPolicy
In 7b4cf1901, the deprecated `io.debezium.embedded.spi.OffsetCommitPolicy`,
which also provided a constructor taking `Configuration`, was removed.
This constructor was actually used for creating `OffsetCommitPolicy`.
`Configuration` provides default values for options which are not
explicitly set, while the `Properties`-based constructor cannot do that,
and therefore with this switch the default value for
`offset.flush.interval.ms` is now missing.

As the debezium-api package has no knowledge about the `Configuration`
interface, which is part of debezium-core, and thus about the default
values, specify the default value directly in the `PeriodicCommitOffsetPolicy`
class.
2023-11-14 06:58:11 +01:00
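The fix described in DBZ-7099 above amounts to supplying the fallback at the point where the `Properties` value is read. A minimal sketch follows; the class name is hypothetical and the 60000 ms default is an assumption used for illustration, not a value taken from the commit.

```java
import java.util.Properties;

// Hypothetical sketch, not the Debezium implementation.
class FlushIntervalSketch {

    static final String OFFSET_FLUSH_INTERVAL_MS = "offset.flush.interval.ms";

    // Assumed default, for illustration only.
    static final long DEFAULT_FLUSH_INTERVAL_MS = 60_000L;

    /**
     * Reads the flush interval from plain Properties. Unlike a Configuration
     * object, Properties carries no per-option defaults, so the default is
     * passed explicitly to getProperty.
     */
    static long flushIntervalMs(Properties props) {
        String value = props.getProperty(OFFSET_FLUSH_INTERVAL_MS,
                String.valueOf(DEFAULT_FLUSH_INTERVAL_MS));
        return Long.parseLong(value);
    }
}
```

With an empty `Properties` object the method returns the assumed default; once `offset.flush.interval.ms` is set, the configured value wins.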
Debezium Builder
1521445908 [maven-release-plugin] prepare for next development iteration 2023-11-10 10:26:05 +00:00
Debezium Builder
6c6f6e9138 [maven-release-plugin] prepare release v2.5.0.Alpha2 2023-11-10 10:26:05 +00:00
Vojtech Juranek
cf7f0f3801 DBZ-7110 Use better name for DebeziumEngine.Builder implementation 2023-11-07 10:13:50 +01:00
Vojtech Juranek
c34cf6920c DBZ-7110 Remove deprecated EmbeddedEngine interface 2023-11-07 10:13:50 +01:00
Vojtech Juranek
11d2ff0b9b DBZ-7007 Use better name for auxiliary variable detecting if engine is running 2023-11-06 10:50:21 +01:00
Vojtech Juranek
89054bf8a8 DBZ-7007 Introduce TestingDebeziumEngine and switch to it in AbstractConnectorTest
Postgres `RecordsStreamProducerIT` relies on
EmbeddedEngine.runWithTask(). As this method effectively exposes the
engine's internal task and tests assert against the state
of the task, it's hard to replace it. If we want to keep the tests,
the simplest approach seems to be to expose the engine task in a similar
way to how EmbeddedEngine does it.

Add an interface for testing the Debezium engine, which defines the minimal
set of methods that need to be exposed by the implementing classes to
be able to run the testsuite against the Debezium engine. The number of
such methods should be as low as possible. Implementing classes would
typically act as proxies to actual `DebeziumEngine` implementations.

Add `TestingDebeziumEngine` implementation for `EmbeddedEngine` and
switch to `TestingDebeziumEngine` in `AbstractConnectorTest`.
2023-11-06 10:50:21 +01:00