Commit Graph

377 Commits

Author SHA1 Message Date
Horia Chiorean
8e14f150db DBZ-3 Adds the structure for a Postgres connector which uses a Debezium Postgres docker image that has the decoderbufs plugin enabled to read WAL changes 2016-12-27 14:44:29 +02:00
Randall Hauch
49e6231b69 DBZ-151 Removed integration module from normal build 2016-12-21 17:11:15 -06:00
Randall Hauch
928db59807 DBZ-170 Changed the MongoDB connector’s connection logic 2016-12-21 17:06:58 -06:00
Randall Hauch
a960d25ea7 Updated changelog for the 0.3.6 release. 2016-12-21 15:54:51 -06:00
Randall Hauch
d808f7e6a0 Merge pull request #159 from rhauch/dbz-170
DBZ-170 Changed the MongoDB connector’s connection logic
2016-12-21 15:19:00 -06:00
Randall Hauch
443edc358d DBZ-170 Changed the MongoDB connector’s connection logic
This change alters the way the MongoDB connector connects to the various servers in a cluster. Previously, the ConnectionContext constructor set up the MongoDB client with credentials for the `admin` and `config` databases, and apparently the client eagerly performs authentication against all databases passed in, rather than doing this lazily as DBs are used.

Now, the code no longer sets up credentials for the `config` database and only sets up credentials for the `admin` database for authentication and authorization. This works as long as the user specified in the connector configuration can read the `config` database.

Several other changes were made to improve the error handling and reporting when the replica set information cannot be read from the `config` database.
2016-12-21 14:27:42 -06:00
Randall Hauch
971df374a8 Merge pull request #158 from rhauch/dbz-164
DBZ-164 Improved MySQL snapshot reader logic
2016-12-21 08:49:08 -06:00
Randall Hauch
e60839e76b DBZ-164 Improved MySQL snapshot reader logic
Added more logic to the snapshot reader to better handle errors when reading the list of table names in each database. Now, any errors with a single database (e.g., some of the not-quite-a-database names described in the JIRA issue) will cause the snapshot reader to simply skip that database name and continue on (with proper logging).

This change also quotes all of the database and table names when used in SQL statements.
2016-12-20 22:03:46 -06:00
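The skip-and-continue behavior and identifier quoting described in DBZ-164 can be sketched roughly as follows. This is an illustrative Python sketch, not the connector's Java code; `quote`, `tables_by_database`, and the `list_tables` callback are hypothetical names introduced here.

```python
import logging
from typing import Callable, Dict, Iterable, List

log = logging.getLogger("snapshot-sketch")


def quote(identifier: str) -> str:
    """Quote a MySQL identifier with backticks, doubling any embedded backtick."""
    return "`" + identifier.replace("`", "``") + "`"


def tables_by_database(
    databases: Iterable[str],
    list_tables: Callable[[str], List[str]],  # hypothetical: runs the SQL, returns table names
) -> Dict[str, List[str]]:
    """Collect table names per database, skipping any database that fails."""
    result: Dict[str, List[str]] = {}
    for db in databases:
        sql = "SHOW TABLES IN " + quote(db)
        try:
            result[db] = list_tables(sql)
        except Exception as error:
            # One bad database name must not abort the whole snapshot.
            log.warning("Skipping database %r: %s", db, error)
    return result
```

A failing database is logged and left out of the result, while the remaining databases are still processed.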
Randall Hauch
fd7e152852 Merge pull request #142 from rhauch/dbz-151
DBZ-151 Added new integration test framework
2016-12-20 17:53:16 -06:00
Randall Hauch
ab1140ef70 Merge pull request #155 from rhauch/dbz-169
DBZ-169 MySQL connector support for ON UPDATE clauses
2016-12-20 17:48:06 -06:00
Randall Hauch
fe44380d4c Merge pull request #154 from rhauch/dbz-168
DBZ-168 MySQL connector ignores XA binlog events
2016-12-20 17:47:57 -06:00
Randall Hauch
07a7858792 Merge pull request #156 from rhauch/dbz-167
DBZ-167 Corrected the MongoUtil class
2016-12-20 17:47:45 -06:00
Randall Hauch
b578afc9aa Merge pull request #157 from rhauch/dbz-152
DBZ-152 Enabled MySQL connector to skip table count checks during snapshot
2016-12-20 17:47:33 -06:00
Randall Hauch
a9a84cb6aa DBZ-152 Enabled MySQL connector to skip table count checks during snapshot
Change the MySQL connector’s `min.row.count.to.stream.results` configuration property to accept a value of 0, which signifies that all `SELECT COUNT(*) FROM tableA` queries should be skipped and instead all results should be streamed.
2016-12-20 17:40:57 -06:00
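The effect of a zero threshold in DBZ-152 can be illustrated with a small Python sketch. The function name, the `count_rows` callback, and the `>=` comparison are assumptions made for illustration; only the "0 means skip the count and always stream" rule comes from the commit message.

```python
from typing import Callable


def should_stream_results(min_row_count_to_stream: int,
                          count_rows: Callable[[], int]) -> bool:
    """Decide whether a table's snapshot rows should be streamed.

    A threshold of 0 means the COUNT(*) query is never issued and
    results are always streamed.
    """
    if min_row_count_to_stream == 0:
        return True  # skip the potentially expensive count entirely
    # Assumed comparison: stream once the table reaches the threshold.
    return count_rows() >= min_row_count_to_stream
```

Passing the count as a callback keeps the query lazy, so it is only executed when the threshold is non-zero.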
Randall Hauch
c1a26ee261 DBZ-167 Corrected the MongoUtil class
Corrected the `onCollection` utility method in the `MongoUtil` class to pass in the collection name rather than the database name.
2016-12-20 16:24:07 -06:00
Randall Hauch
046702d959 DBZ-169 MySQL connector support for ON UPDATE clauses
Corrected the MySQL DDL parser to support `ON UPDATE NOW()` clauses in addition to `ON UPDATE CURRENT_TIMESTAMP`.
2016-12-20 16:19:18 -06:00
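As a rough illustration of the two clause forms DBZ-169 refers to, the following Python sketch recognizes both with a regular expression. The real connector uses a token-based DDL parser written in Java, so the regex approach and the name `has_on_update_clause` are assumptions made only for this example.

```python
import re

# Matches ON UPDATE CURRENT_TIMESTAMP (optionally with parentheses and a
# fractional-seconds precision) as well as ON UPDATE NOW(...).
ON_UPDATE = re.compile(
    r"\bON\s+UPDATE\s+"
    r"(?:CURRENT_TIMESTAMP\b(?:\s*\(\s*\d*\s*\))?|NOW\s*\(\s*\d*\s*\))",
    re.IGNORECASE,
)


def has_on_update_clause(column_definition: str) -> bool:
    """Return True if the column definition carries an ON UPDATE timestamp clause."""
    return ON_UPDATE.search(column_definition) is not None
```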
Randall Hauch
09f87cf190 DBZ-168 MySQL connector ignores XA binlog events
MySQL 5.7.7 introduced new behavior for handling XA events in the binlog. See the [MySQL documentation](http://dev.mysql.com/doc/refman/5.7/en/xa-restrictions.html) for details. This PR changes the binlog reader so that `XA …` statements appearing in the binlog are ignored altogether.
2016-12-20 15:32:44 -06:00
Randall Hauch
5dceb05f69 DBZ-151 Additional changes to improve test framework and MySQL integration tests 2016-12-20 10:58:56 -06:00
Randall Hauch
08e32a4a8b DBZ-151 Added multiple integration test modules to test various MySQL versions and configurations.
These new modules run during the '-Passembly' profile and use the new integration test framework that compares all
output produced by a connector to expected results that were previously recorded and verified. These integration test modules
can be run manually with a simple build of those modules or their parent; only the top-level 'integration-tests' module is run
during the assembly profile during builds of the entire codebase.
2016-12-20 09:18:10 -06:00
Randall Hauch
a3bece4472 DBZ-151 Added new integration test framework for easily comparing output of connectors to expected results. 2016-12-20 09:18:09 -06:00
Randall Hauch
d1d21166b9 Merge pull request #152 from rhauch/dbz-166
DBZ-166 Corrected shutdown logic of MySQL connector
2016-12-20 08:48:03 -06:00
Randall Hauch
0851d8280c DBZ-166 Corrected shutdown logic of MySQL connector
The MySQL connector uses several threads, and previously these threads were simply cancelled upon connector shutdown. This is fine for the binlog reader (which can stop at any moment), but is a poor approach for the snapshot, as we didn’t always properly release the database resources and also didn’t complete the writing of the DDL history.

With this change, the snapshot reader stops in a very controlled manner: the 10-step snapshot procedure frequently checks whether the reader is to continue working, and thread interruption is avoided altogether. The snapshot procedure will also always clean up its database resources (locks, transactions, etc.), even if the procedure is stopped before completion.

This change also refactors how the snapshot and binlog readers are managed. This is no longer done in the MySqlConnectorTask class (which is busy enough), but rather the logic has been encapsulated in a new `ChainedReader` that makes use of a new `Reader` interface. This makes testing of `ChainedReader` easier, and ensures that `ChainedReader` relies only upon the primary methods of `Reader` rather than upon `AbstractReader`. `ChainedReader` handles multiple readers generically, and ensures that when stopped the readers are all handled correctly and completely process all records, while avoiding accidentally starting any subsequent reader when stopping the previous one.
2016-12-15 10:55:18 -06:00
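The cooperative-stop and chaining ideas from DBZ-166 can be sketched in a few lines of Python. This is a single-threaded illustration, not the connector's Java implementation: `ListReader`, its `released` flag, and the `emit` callback are hypothetical, and only the names `ChainedReader` and `Reader` appear in the commit message.

```python
from typing import Callable, Iterable, List, Optional


class ListReader:
    """Hypothetical reader that emits records and checks a stop flag between steps."""

    def __init__(self, name: str, records: Iterable[int]) -> None:
        self.name = name
        self._pending: List[int] = list(records)
        self._stop_requested = False
        self.started = False
        self.released = False  # stands in for locks, transactions, etc.

    def start(self) -> None:
        self.started = True

    def stop(self) -> None:
        self._stop_requested = True  # a flag, not a thread interruption

    def run(self, emit: Callable[[int], None]) -> bool:
        """Return True if all records were emitted, False if stopped early."""
        try:
            for record in self._pending:
                if self._stop_requested:
                    return False
                emit(record)
            return True
        finally:
            self.released = True  # resources are always cleaned up


class ChainedReader:
    """Runs readers in order and never starts a later reader once stopped."""

    def __init__(self, readers: Iterable[ListReader]) -> None:
        self._readers = list(readers)
        self._stop_requested = False
        self._current: Optional[ListReader] = None

    def stop(self) -> None:
        self._stop_requested = True
        if self._current is not None:
            self._current.stop()

    def run(self, emit: Callable[[int], None]) -> None:
        for reader in self._readers:
            if self._stop_requested:
                break
            self._current = reader
            reader.start()
            if not reader.run(emit):
                break
```

Because stopping is a flag checked between steps, the `finally` clause in `run` always executes, which mirrors the guarantee that database resources are released even when the snapshot is stopped before completion.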
Randall Hauch
ca0c79f307 Merge pull request #151 from rhauch/dbz-163b
DBZ-163 Stop Travis-CI’s MySQL and PostgreSQL instances
2016-12-08 08:46:14 -06:00
Randall Hauch
3b30d7d046 Merge pull request #150 from rhauch/dbz-161
DBZ-161 Corrected MySQL connector logic when no GTIDs are used
2016-12-08 08:31:27 -06:00
Randall Hauch
923371fe22 DBZ-163 Stop Travis-CI’s MySQL and PostgreSQL instances
Recently, Travis-CI changed the sudo-enabled Trusty images that we use in our builds to install and run MySQL 5.6 and Postgres 9.6 by default. This commit stops those services in the `before_install` step of our Travis-CI builds.
2016-12-08 08:15:33 -06:00
Randall Hauch
e3e66bf960 DBZ-161 Corrected MySQL connector logic when no GTIDs are used
Corrected the logic of the MySQL connector when getting the server’s GTID set. Previously, this logic failed if GTIDs were not used.
2016-12-08 08:09:52 -06:00
Randall Hauch
7ee444546b Merge pull request #147 from DennisPersson/DBZ-142
DBZ-142 Handle national character set columns in DDL parser
2016-12-07 09:40:25 -06:00
Dennis Persson
acd7bd8fa5 DBZ-142 Handle national character set columns in DDL parser 2016-12-07 07:38:30 +01:00
Randall Hauch
07fe71385c Merge pull request #149 from rhauch/dbz-162
DBZ-162 Corrected DDL parsing of MySQL functions
2016-12-06 17:41:58 -06:00
Randall Hauch
c762a221b7 DBZ-162 Corrected DDL parsing of MySQL functions
The MySQL DDL parser was not properly consuming function declarations. For functions, the parser consumes the entire statement without handling the various expressions within the function declaration, but the parser was not properly finding the end of the statement and instead was continuing to try to consume values beyond the end of the statement.

Specifically, when the parser consumes a `BEGIN`, it looks for a corresponding `END`. However, if it encountered an `END IF`, the `IF` plus any remaining tokens were left on the token stream and unprocessed. This confused the parser, which kept looking for statements and ultimately ended with a `No more content` error.

This case was replicated in integration tests, and the code fixed to properly find the end of the statements.
2016-12-06 17:34:52 -06:00
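The `BEGIN`/`END` matching problem described in DBZ-162 can be illustrated with a simplified Python sketch. It is not the parser's actual Java code: the function name and the `COMPOUND_ENDS` set are assumptions, and the sketch deliberately ignores cases such as a `CASE ... END` expression.

```python
from typing import List

# Keywords that may follow END to close an inner construct rather than a BEGIN.
COMPOUND_ENDS = {"IF", "WHILE", "LOOP", "REPEAT", "CASE"}


def consume_begin_block(tokens: List[str], start: int) -> int:
    """Return the index just past the END that matches the BEGIN at tokens[start]."""
    if tokens[start].upper() != "BEGIN":
        raise ValueError("Expected BEGIN at the start position")
    depth = 0
    i = start
    while i < len(tokens):
        token = tokens[i].upper()
        if token == "BEGIN":
            depth += 1
        elif token == "END":
            following = tokens[i + 1].upper() if i + 1 < len(tokens) else ""
            if following in COMPOUND_ENDS:
                i += 2  # consume both 'END' and 'IF' (or similar) together
                continue
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    raise ValueError("No matching END for BEGIN")
```

Consuming `END IF` as a pair is the key step: without it, the `IF` would be left on the token stream, which is the failure the commit message describes.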
Randall Hauch
c72242eeb0 Merge pull request #145 from sherafpm/bugfix/DBZ-160
DBZ-160 - Issue while parsing create table script with ENUM type and default value 'b'
2016-12-06 14:21:23 -06:00
Randall Hauch
405f343437 Merge pull request #146 from rhauch/dbz-163
DBZ-163 Changed Travis-CI build to skip the install dependencies step
2016-12-05 16:54:09 -06:00
Randall Hauch
eedc4fba00 DBZ-163 Corrected assembly profile in build
The Travis-CI builds run the Maven build using the `assembly` profile, and this has been failing quite a bit lately.

The first problem appears to be that the Travis-CI environment recently changed to have port 3306 taken, which means that our build fails to start any Docker containers for MySQL that attempt to use this port. A simple fix is to use different ports for the assembly build.

However, trying to change the port numbers for some of the profiles caused a lot of problems, and to correct these required refactoring how the properties are set. The Docker Maven plugin is now configured with separate properties that are set once (depending upon the profile) to determine the port assignments of the various Docker containers. The Failsafe plugin executions then use these Maven properties when setting the system variables (e.g., `database.host`) needed in the integration tests. This appears to have worked, but it still is a bit fragile. For example, the assembly profile defines several Failsafe executions, and during this profile these should be the only executions run; however, if not all the properties are set properly, the build seems to also run the default Failsafe execution in addition to the other `assembly` profile executions. (I think properties can’t only be defined in the execution, but need to also be defined in the Failsafe configuration.)

The “alternative” MySQL Docker images were removed, since they should not behave any differently from the `mysql/mysql-server` images we normally use. The extra containers required a lot more resources to run and dramatically increased the complexity of the build.

A few other trivial changes were made.
2016-12-05 16:37:59 -06:00
Randall Hauch
2b2bf693d7 DBZ-163 Changed Travis-CI build to skip the install dependencies step 2016-12-02 15:43:57 -06:00
Sherafudheen PM
ee52219736 DBZ-160 - Issue while parsing create table script with ENUM type and default value 'b' 2016-12-02 17:42:44 +05:30
Randall Hauch
09545e0ec9 Merge pull request #143 from rhauch/dbz-157
DBZ-157 Upgraded Docker Maven plugin
2016-11-22 09:38:10 -06:00
Randall Hauch
0bf3b4c9f3 DBZ-157 Upgraded Docker Maven plugin
Upgraded the Docker Maven plugin to 0.18.1, which required changing our use of the `docker.image` to `docker.filter` (per the [changes in 0.17.1](https://github.com/fabric8io/docker-maven-plugin/blob/master/doc/changelog.md)).
2016-11-22 09:23:07 -06:00
Randall Hauch
e5e56d0252 Merge pull request #141 from hchiorean/DBZ-156
DBZ-156 Adds better error handling to the EmbeddedEngine
2016-11-18 08:14:56 -06:00
Horia Chiorean
968cf62b23 DBZ-156 Adds better error handling to the EmbeddedEngine 2016-11-18 11:04:00 +02:00
Randall Hauch
0dc86dbdae Merge pull request #140 from hchiorean/DBZ-156
DBZ-156 Updates EmbeddedEngine to better handle exceptional cases and provide more feedback during startup
2016-11-17 19:15:07 -06:00
Horia Chiorean
506457c13b DBZ-156 Updates EmbeddedEngine to better handle exceptional cases and provide more feedback during startup
It also updates EmbeddedEngine to use the Kafka commit callbacks introduced after 0.10 and updates AbstractConnectorTest to better synchronize with the embedded engine.
2016-11-17 19:18:07 +02:00
Randall Hauch
a82ae5691b Reduce the log verbosity of the MySQL tests 2016-11-14 13:41:10 -06:00
Randall Hauch
bfbf485123 Upgrade MySQL JDBC driver 2016-11-14 13:41:01 -06:00
Randall Hauch
86476faffa Updated release notes for the 0.3.5 release. 2016-11-14 12:41:42 -06:00
Randall Hauch
d80bc1bfd7 DBZ-153 MySQL connector supports enum and set values with parentheses
Changed the MySQL connector to support ENUM and SET literals with parentheses.
2016-11-14 12:22:08 -06:00
Randall Hauch
8a52cda0dc DBZ-150 Changed the order of events when a row's key is changed. 2016-11-09 14:42:43 -06:00
Randall Hauch
b0ded5f383 DBZ-147 Added ability to treat MySQL DECIMAL as double
By default the MySQL connector handles `DECIMAL` and `NUMERIC` columns using `java.math.BigDecimal` values and describing them using the `org.apache.kafka.connect.data.Decimal` schema type, which serializes the values to a binary form.

This change adds a configuration option that keeps the existing behavior as the default, but allows `DECIMAL` and `NUMERIC` values to be handled instead as Java `double` values with a schema type of `FLOAT64`.
2016-11-09 11:27:09 -06:00
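The trade-off behind DBZ-147 can be illustrated in Python, where `decimal.Decimal` plays the role of `java.math.BigDecimal` and `float` is a 64-bit double. The `as_double` flag is a hypothetical stand-in for the new configuration option, whose actual name is not given in the commit message.

```python
from decimal import Decimal
from typing import Union


def convert_decimal(value: str, as_double: bool) -> Union[Decimal, float]:
    """Convert a DECIMAL/NUMERIC column value read as text.

    as_double=False keeps the exact value (the default behavior);
    as_double=True returns a 64-bit floating point approximation.
    """
    exact = Decimal(value)
    return float(exact) if as_double else exact
```

The double form is easier for many consumers to handle, but it is an approximation: values with many significant digits cannot be represented exactly.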
Randall Hauch
ea5f7983c7 DBZ-144 Corrected MySQL connector restart
Added tests to verify whether the connector properly restarts in the binlog when the connector previously failed or stopped in the middle of a transaction. The tests showed that the connector is not able to start properly, whether or not GTIDs are used, because restarting from an arbitrary binlog event skips the TABLE_MAP events for the affected tables.

The logic was changed significantly to record in the offsets the binlog coordinates at the start of the transaction, which should work whether or not GTIDs are used. Upon restart, the connector may have to re-read the events that were previously processed, but now the offset also includes the number of events that were previously processed so that these can be skipped upon restart.

This has an unfortunate side effect: the offsets capture that a transaction was completed only when the connector generates a source record for the subsequent transaction. This is because the connector generates source records (with their offsets) for the binlog events in the transaction before the transaction's commit is seen. And, since no additional source records are produced for the transaction commit, the recorded offsets continue to point to the start of the completed transaction, with all of its events marked to be skipped, until the first source record of the subsequent transaction is generated. Thus, upon restart the connector has to re-read (but ignore) all of the binlog events associated with the completed transaction. This shouldn’t be a problem, and will only slow restarts for very large transactions.
2016-11-09 08:11:41 -06:00
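The restart approach described in DBZ-144 can be sketched in Python as a replay from the start of the transaction. The event tuples, the function name, and the choice to count every event by its position are assumptions for illustration; the commit message only states that the offset holds the transaction's starting coordinates plus the number of events already processed.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, int, str]  # (kind, table id, payload): a simplified binlog event


def replay_from_transaction_start(events: List[Event],
                                  events_to_skip: int) -> List[Tuple[str, str]]:
    """Re-read a transaction from its first event, skipping already processed ones.

    TABLE_MAP events are always applied so table metadata is rebuilt, even
    when they fall inside the skipped range.
    """
    table_names: Dict[int, str] = {}
    emitted: List[Tuple[str, str]] = []
    for position, (kind, table_id, payload) in enumerate(events):
        if kind == "TABLE_MAP":
            table_names[table_id] = payload
            continue
        if position < events_to_skip:
            continue  # re-read but ignored: handled before the restart
        emitted.append((table_names[table_id], payload))
    return emitted
```

Starting at the transaction boundary is what makes the table metadata available again; the skip count then prevents rows that were already emitted from being produced twice.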
Randall Hauch
0d2acfd0a6 DBZ-149 Corrected type of BINARY column
The MySQL connector (or rather the DDL parser used in the connector) improperly assumed a `CHAR` JDBC type (and Avro schema `STRING` type) for MySQL columns of type `BINARY`. This corrects the error.
2016-11-08 17:41:01 -06:00
Randall Hauch
c66037f52a Merge pull request #133 from rhauch/dbz-148
DBZ-148 Corrected timestamp check in test case to account for DST
2016-11-08 16:00:01 -06:00