* Keeping indentation formatting for JSON
Co-authored-by: Gunnar Morling <gunnar.morling@googlemail.com>
Co-authored-by: Chris Cranford <chris@hibernate.org>
It appears there is an intent to restore the interruption flag
when catching InterruptedException. However the old code was just
checking the current state of the flag instead of raising it.
This commit does a few things:
- Refactors snapshot modes to be encapsulated by an interface and
to only use that interface in determining when to snapshot and in
determing the type of the `RecordProducer` interface to instantiate
- Refactors the configuration of existing snapshot modes to tie the
existing snapshot modes to their aligned implementation
- Adds a new snapshot.mode, custom, and a new configuration option to
specify a custom implementation that will be loaded by the class loader
- Changes the visibility of some classes to allow for custom snapshot
modes to get enough context to make an informed choice
- Adds some metadata about slots (the catalog_xmin) to give a full idea
of the state of slots which can be useful in implementing snapshot
modes (which is also configurable, as it can add some overhead)
Together, these changes allow for a much broader flexibility got end
users to implement a snapshot mode that can do more advanced snapshots,
such as partial recovery or for partial snapshots for tables where not
all records are needed.
This could also be seen as superseeding the
`snapshot.select.statement.overrides` to allow for users to dynamically
build queries based on the state of the slot and the offsets consumed.
This introduces a new API to the EmbeddedEngine, the ChangeConsumer,
which gives the user a more flexible option for consuming changes by
exposing groups of records as well as the ability to control the
comitting of those records.
This remainds completely backwards compatible with the old API as the
ChangeConsumer wraps the existing Consumer interface with a default
implementation.
* Renaming ConfigurationHelper to Instantiator
* Doc improvements and typo fixes
* Bringing getInstance() methods into consistent order
* Raising exception instead of logging error if instantion fails
Changed the events’ `source` structure to optionally contain the identifier of the MySQL thread where appropriate. The thread is included on each `BEGIN` binlog event, so these are captured and added to all of the associated change events produced for that transaction.
The version of the DB server required for this to work is at least 9.4. To be able to stream logical changes, the code relies on enhancements to the JDBC driver which are not yet public. Therefore, the current codebase includes the sources for the JDBC driver.
The commit also updates the general DBZ build system for:
* custom checkstyle package exclusions - required by the Postgres driver the protobuf code for now
* adds support for debugging Surefire and Failsafe
The Travis-CI builds run the Maven build using the `assembly` profile, and this has been failing quite a bit lately.
The first problem appears to be that the Travis-CI environment recently changed to have port 3306 taken, which means that our build fails to start any Docker containers for MySQL that attempt to use this port. A simple fix is to use different ports for the assembly build.
However, trying to change the port numbers for some of the profiles caused a lot of problems, and to correct these required refactoring how the properties are set. The Docker Maven plugin is now configured with separate properties that are set once (depending upon the profile) to determine the port assignments of the various Docker containers. The Failsafe plugin executions then use these Maven properties when setting the system variables (e.g., `database.host`) needed in the integration tests. This appears to have worked, but it still is a bit fragile. For example, the assembly profile defines several Failsafe executions, and during this profile these should be the only executions run; however, if not all the properties are set properly, the build seems to also run the default Failsafe execution in addition to the other `assembly` profile executions. (I think properties can’t only be defined in the execution, but need to also be defined in the Failsafe configuration.)
The “alternative” MySQL Docker images were removed, since they basically should not provide any different behavior than the `mysql/mysql-server` images we normally used. The extra containers required a lot more resources to run and dramatically increased the complexity of the build.
A few other trivial changes were made.
It also updates EmbeddedEngine to use the Kafka commit callbacks introduced after 0.10 and updates AbstractConnectorTest to better synchronize with the embedded engine
Added tests to verify whether the connector is properly restarting in the binlog when previously the connector failed or stopped in the middle of a transaction. The tests showed that the connector is not able to properly start when using or not using GTIDs, since restarting from an arbitrary binlog event causes problems since the TABLE_MAP events for the affected tables are skipped.
The logic was changed significantly to record in the offsets the binlog coordinates at the start of the transaction, which should work whether or not GTIDs are used. Upon restart, the connector may have to re-read the events that were previously processed, but now the offset also includes the number of events that were previously processed so that these can be skipped upon restart.
This has an unforunate side effect since the offsets capture a transaction was completed only when it generates a source record for the subsequent transaction. This is because the connector generates source records (with their offsets) for the binlog events in the transaction before the transaction's commit is seen. And, since no additional source records are produced for the transaction commit, the recorded offsets will show that the prior transaction is complete and that all of the events in the subsequent transaction are to be skipped. Thus, upon restart the connector has to re-read (but ignore) all of the binlog events associated with the completed transaction. This shouldn’t be a problem, and will only slow restarts for very large transactions.