5292 Commits

Author SHA1 Message Date
Randall Hauch
af94fa8759 DBZ-193 MySQL DDL parser handles FULLTEXT index
Fixed the MySQL DDL parser so that it correctly handles `FULLTEXT` indexes within a `CREATE TABLE` statement. The parser was incorrectly using `canConsume(…)` with a list of options instead of `canConsumeAnyOf(…)`.
2017-02-10 15:49:20 -06:00
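The difference between the two calls can be illustrated with a minimal sketch. It assumes that `canConsume(…)` matches its arguments as a sequence of consecutive tokens while `canConsumeAnyOf(…)` matches a single token against alternatives. The `TokenStream` class, its snake_case methods, and the `SPATIAL` alternative are hypothetical stand-ins for illustration, not the connector's actual parser code.

```python
class TokenStream:
    """Hypothetical, minimal token stream; not the connector's parser class."""

    def __init__(self, text):
        self.tokens = text.upper().split()
        self.pos = 0

    def can_consume(self, *expected):
        """Consume the tokens only if ALL of them appear next, in order."""
        end = self.pos + len(expected)
        if tuple(self.tokens[self.pos:end]) == expected:
            self.pos = end
            return True
        return False

    def can_consume_any_of(self, *options):
        """Consume the next token if it equals ANY ONE of the options."""
        if self.pos < len(self.tokens) and self.tokens[self.pos] in options:
            self.pos += 1
            return True
        return False


stream = TokenStream("FULLTEXT INDEX idx_body (body)")
# Passing alternatives to the sequence matcher fails, because the
# upcoming tokens are "FULLTEXT INDEX", not "FULLTEXT SPATIAL".
sequence_matched = stream.can_consume("FULLTEXT", "SPATIAL")
# Treating the same arguments as alternatives matches the FULLTEXT keyword.
alternative_matched = stream.can_consume_any_of("FULLTEXT", "SPATIAL")
```

With the sequence matcher the keyword is never consumed, which is consistent with the kind of failure the commit describes.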
Randall Hauch
9a4a177004 DBZ-188 Corrected JavaDoc 2017-02-10 15:39:22 -06:00
Randall Hauch
333cf8e010 Updated list of contributors 2017-02-10 15:37:23 -06:00
Randall Hauch
8304625d7c Merge pull request #185 from dasl-/DBZ-188
DBZ-188: Allow a debezium mysql connector to filter production of DML…
2017-02-10 15:17:09 -06:00
Randall Hauch
1ed9f66308 Merge pull request #187 from rhauch/dbz-192
DBZ-192 Corrected links to release notes to public URLs
2017-02-10 15:12:07 -06:00
Randall Hauch
01ee528526 DBZ-192 Corrected links to release notes to public URLs 2017-02-10 15:11:31 -06:00
dleibovic
aa50bfe71a DBZ-188: Allow a debezium mysql connector to filter production of DML events into kafka by the mysql UUID of the event
With GTIDs enabled, each transaction in the binlog contains a GTID event, which gives us access to the GTID of the transaction. The GTID has the following format: source_id:transaction_id, where source_id is the UUID of the mysql server the transaction was written to.

I propose to allow a debezium instance to be configured with a UUID pattern to check against before producing DML events into Kafka. Debezium would produce a DML event into kafka if and only if the UUID in the event's GTID matches the pattern with which debezium was configured.

The configuration for the UUID patterns will make use of the existing gtid.source.includes and gtid.source.excludes options. The DML event filtering will only be performed if the new option gtid.source.filter.dml.events is true.
2017-02-10 14:14:10 -05:00
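A minimal sketch of the proposed check, assuming the patterns are full-match regular expressions over the server UUID. The function names and the example GTIDs are hypothetical; only the `source_id:transaction_id` format and the option names come from the message above.

```python
import re


def gtid_source_id(gtid):
    """Return the server UUID from a GTID of the form source_id:transaction_id."""
    return gtid.split(":", 1)[0]


def should_produce_dml(gtid, filter_dml_events, includes=None, excludes=None):
    """Decide whether a DML event with this GTID is produced into Kafka.

    filter_dml_events stands in for gtid.source.filter.dml.events;
    includes and excludes stand in for gtid.source.includes and
    gtid.source.excludes (assumed here to be lists of regex patterns).
    """
    if not filter_dml_events:
        return True  # filtering disabled: every DML event is produced
    source_id = gtid_source_id(gtid)
    if includes is not None:
        return any(re.fullmatch(p, source_id) for p in includes)
    if excludes is not None:
        return not any(re.fullmatch(p, source_id) for p in excludes)
    return True


local = "3e11fa47-71ca-11e1-9e33-c80aa9429562:23"
remote = "7c1de3f2-3fd2-11e6-9cdc-42010af000bc:9"
patterns = ["3e11fa47-.*"]
produced_local = should_produce_dml(local, True, includes=patterns)
produced_remote = should_produce_dml(remote, True, includes=patterns)
produced_unfiltered = should_produce_dml(remote, False, includes=patterns)
```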
Randall Hauch
3aea39397f Merge pull request #183 from rhauch/dbz-188-uuid-filters
DBZ-188 More efficient GTID source filters for MySQL Connector
2017-02-10 12:14:05 -06:00
Randall Hauch
d2986710a5 DBZ-188 More efficient GTID source filters for MySQL Connector
Changed the GTID source filters in the MySQL connector to be far more efficient when the filters specify literal UUIDs rather than regex patterns. In these cases, the predicate just checks whether a supplied value is in a hash set, and no regular expression patterns are used.

The GTID source filters can still be a combination of UUID literals and regular expressions, and the predicate will use the best implementation for each. For example, if the filters include all UUID literals, then regular expressions will never be used.
2017-02-10 11:34:24 -06:00
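The idea can be sketched as follows: literal UUIDs go into a hash set, and only the remaining filters are compiled as regular expressions. This is an illustration under stated assumptions rather than the connector's code; `build_source_predicate`, the literal-detection regex, and the case-insensitive comparison are all hypothetical choices.

```python
import re

# A filter counts as a literal if it looks exactly like a UUID (assumption).
UUID_LITERAL = re.compile(r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def build_source_predicate(filters):
    """Return a predicate that tests a GTID source UUID against the filters."""
    literals = set()
    patterns = []
    for item in filters:
        if UUID_LITERAL.fullmatch(item):
            literals.add(item.lower())  # cheap set lookup later
        else:
            patterns.append(re.compile(item, re.IGNORECASE))

    def matches(source_id):
        if source_id.lower() in literals:
            return True
        # Reached only for non-literal filters; with all-literal filters
        # the patterns list is empty and no regex is ever evaluated.
        return any(p.fullmatch(source_id) for p in patterns)

    return matches


predicate = build_source_predicate(
    ["3e11fa47-71ca-11e1-9e33-c80aa9429562", "7c1de3f2-.*"]
)
```

Splitting the filters once at construction time keeps the per-event cost to a single set lookup in the common case of literal UUIDs.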
Randall Hauch
8c60c29883 [maven-release-plugin] prepare for next development iteration 2017-02-07 14:22:12 -06:00
Randall Hauch
20134286e9 [maven-release-plugin] prepare release v0.4.0 2017-02-07 14:22:11 -06:00
Randall Hauch
e5b42e4bc1 Updated changelog for 0.4.0 2017-02-07 13:35:30 -06:00
Randall Hauch
6da2a8a1b2 Merge pull request #178 from rhauch/dbz-185
DBZ-185 MySQL’s database history now filters GTID sources
2017-02-07 13:25:32 -06:00
Randall Hauch
e8414ad6c1 Merge pull request #181 from rhauch/dbz-187
DBZ-187 Upgrade MongoDB server and Java driver versions
2017-02-07 12:56:52 -06:00
Randall Hauch
896dd35bcb DBZ-187 Upgrade MongoDB server and Java driver versions
Upgraded the MongoDB server to 3.2.12 and the Java driver to 3.4.2.
2017-02-07 12:49:50 -06:00
Randall Hauch
7303bdfccb Merge pull request #180 from rhauch/dbz-186
DBZ-186 Upgraded MySQL binary log client library
2017-02-07 12:43:06 -06:00
Randall Hauch
9ae50b3691 DBZ-186 Upgraded MySQL binary log client library
Upgraded Shyiko’s MySQL binary log client library from 0.8.0 to 0.9.0 to get the new timeout behavior when it opens sockets and a fix for JSON array processing.
2017-02-07 12:34:12 -06:00
Randall Hauch
403fee1375 DBZ-185 MySQL’s database history now filters GTID sources
Corrects how the MySQL connector reloads database history to take into account the included and excluded GTID sources. This only affects a connector configured to capture changes from _multiple_ MySQL database servers when GTID sources are explicitly excluded or included.
2017-02-07 11:21:22 -06:00
Randall Hauch
6cbc78c5c4 Merge pull request #179 from rhauch/dbz-173
DBZ-173 Upgraded Confluent Platform libraries used in test cases
2017-02-07 11:20:43 -06:00
Randall Hauch
65951308f7 DBZ-173 Upgraded Confluent Platform libraries
Some of our test cases verify (de)serialization using the Avro Converter, which is included in the Confluent Platform. This commit upgrades the Confluent Platform to version 3.1.2, which matches Kafka 0.10.1.1.
2017-02-07 11:18:21 -06:00
Randall Hauch
f10970a4c9 Merge pull request #177 from rhauch/dbz-140
DBZ-140 Improved locking logic to support RDS
2017-02-06 14:33:00 -06:00
Randall Hauch
bb0800ca3a DBZ-140 Improved locking logic to support RDS
Improved the MySQL connector's logic to better handle Amazon RDS, which does not allow granting the `SUPER` privilege to users. As before, the connector starts a transaction and attempts to get a global read lock via `FLUSH TABLES WITH READ LOCK` to prevent writes to the database so that the binlog position can be accurately read _and_ the table schemas can be read without interference from other clients. Once that is done, the connector releases the global read lock and continues in the same transaction to read all table rows. This means that our snapshot is consistent, yet we hold the global read lock for only a very short period of time.

Amazon's RDS and Aurora are hosted MySQL instances that do not allow users to have the `SUPER` privilege, which means the user cannot get a global read lock. In this case, the connector detects this error, continues to read the database and table names (without any lock), and _then_ uses `FLUSH TABLES <tableName> WITH READ LOCK` on each table that satisfies the filters to prevent changes from other clients. The connector then reads the table schemas, reads _all_ table rows, commits the transaction, and _finally_ releases the table locks.

Therefore, when the user cannot obtain a global read lock for lack of privilege, as on RDS, the connector's requirements and behavior differ in two ways:

# The RDS user that the connector makes use of must also have the `LOCK TABLES` privilege; without it the connector will fail during the snapshot.
# The connector must hold the table read locks _until it has completed reading all of the tables_, since releasing the table locks using `UNLOCK TABLES` would prematurely commit our transaction and prevent us from getting a consistent snapshot. From the [MySQL documentation](https://dev.mysql.com/doc/refman/5.7/en/flush.html):
> `UNLOCK TABLES` implicitly commits any active transaction only if any tables currently have been locked with `LOCK TABLES`. The commit does not occur for `UNLOCK TABLES` following `FLUSH TABLES WITH READ LOCK` because the latter statement does not acquire table locks.
2017-02-06 13:56:55 -06:00
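The two orderings described in this message can be laid out side by side. The sketch below only builds the list of steps and executes nothing; `snapshot_steps` and the table names are hypothetical, the statement text mirrors the wording of the message, and lines starting with `--` mark work the connector does rather than SQL. The message does not say where the binlog position is read on the fallback path, so that step is omitted there.

```python
def snapshot_steps(has_global_lock_privilege, tables):
    """Return the ordered snapshot steps for either locking strategy."""
    steps = ["START TRANSACTION"]
    if has_global_lock_privilege:
        steps += [
            "FLUSH TABLES WITH READ LOCK",
            "-- read binlog position",
            "-- read table schemas",
            "UNLOCK TABLES",  # global lock released early; no implicit commit
            "-- read all table rows",
            "COMMIT",
        ]
    else:
        steps.append("-- read database and table names (no lock held)")
        steps += [f"FLUSH TABLES {table} WITH READ LOCK" for table in tables]
        steps += [
            "-- read table schemas",
            "-- read all table rows",
            "COMMIT",
            "UNLOCK TABLES",  # released last: unlocking earlier would commit
        ]
    return steps


global_plan = snapshot_steps(True, ["inventory.customers"])
fallback_plan = snapshot_steps(False, ["inventory.customers", "inventory.orders"])
```

The key contrast is the position of `UNLOCK TABLES`: before the row reads when the global lock is available, and after `COMMIT` when only table locks can be taken.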
Randall Hauch
3ff9ca8344 Merge pull request #176 from rhauch/dbz-182
DBZ-182 Restart MongoDB initial sync if necessary
2017-02-03 08:36:00 -06:00
Randall Hauch
0c17e1f972 DBZ-182 Restart MongoDB initial sync if necessary
Corrected the MongoDB connector upon startup to restart an initial sync if the previously recorded offset signals that an initial sync was not completed in the prior run.

Also changed the connector’s replicator to buffer the last record during an initial sync so that, upon completion of the initial sync, the last record can be updated with an offset that reflects that the initial sync was completed. This way, if the initial sync is completed but there are no other events in the oplog, the connector will still consider the initial sync as completed.
2017-02-02 15:43:18 -06:00
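A minimal sketch of the buffering technique, assuming offsets are plain dictionaries. The `initsync` flag, the `position` field, and both function names are hypothetical and chosen only for illustration.

```python
def needs_initial_sync(previous_offset):
    """Restart the initial sync if there is no offset or it was left unfinished."""
    return previous_offset is None or previous_offset.get("initsync", False)


def initial_sync_records(documents):
    """Yield (document, offset) pairs, always holding back the newest record.

    Holding one record back lets the final record be emitted with an
    offset that marks the initial sync as completed.
    """
    buffered = None
    for position, document in enumerate(documents):
        if buffered is not None:
            yield buffered
        buffered = (document, {"position": position, "initsync": True})
    if buffered is not None:
        document, offset = buffered
        yield document, {**offset, "initsync": False}


records = list(initial_sync_records(["a", "b", "c"]))
flags = [offset["initsync"] for _, offset in records]
```

Because the completion marker travels with the last record itself, a restart after a finished sync sees a completed offset even if no later oplog event was ever recorded.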
Randall Hauch
5490842449 Merge pull request #175 from rhauch/dbz-176
DBZ-176 Corrected MySQL DDL parser to support creating triggers with definers
2017-02-02 13:59:01 -06:00
Randall Hauch
5af316ed56 Merge pull request #174 from rhauch/dbz-184
DBZ-184 Added database and table name to change event metadata
2017-02-02 12:47:08 -06:00
Randall Hauch
74e5ba6448 DBZ-176 Corrected MySQL DDL parser to support creating triggers with definers
The MySQL DDL parser was not correctly handling `DEFINER` clauses within `CREATE TRIGGER` or `CREATE EVENT` statements. Support for `DEFINER` clauses was recently added for the various forms of `CREATE PROCEDURE`, `CREATE FUNCTION` and `CREATE VIEW` statements. These are the only kinds of statements that have the definer attribute, per the [MySQL documentation](https://dev.mysql.com/doc/refman/5.7/en/stored-programs-security.html).
2017-02-02 12:44:28 -06:00
Randall Hauch
32a88fdc6f DBZ-184 Added database and table name to change event metadata 2017-02-02 12:09:53 -06:00
Randall Hauch
6230cab90e Merge pull request #173 from rhauch/dbz-113
DBZ-113 Added MySQL threads to the event’s source metadata
2017-02-02 12:00:19 -06:00
Randall Hauch
fe17b246af DBZ-113 Added MySQL threads to the event’s source metadata
Changed the events’ `source` structure to optionally contain the identifier of the MySQL thread where appropriate. The thread is included on each `BEGIN` binlog event, so these are captured and added to all of the associated change events produced for that transaction.
2017-02-02 11:53:32 -06:00
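The propagation described above might look roughly like this. Events are modeled as plain dictionaries; the event types, field names, and `attach_thread_ids` are hypothetical and do not reflect the binlog client's real event classes.

```python
def attach_thread_ids(binlog_events):
    """Copy the thread id seen on BEGIN onto each change event in that transaction."""
    current_thread = None
    for event in binlog_events:
        if event["type"] == "BEGIN":
            current_thread = event.get("thread")  # may be absent
        elif event["type"] == "COMMIT":
            current_thread = None
        elif event["type"] == "ROW":
            source = {"table": event["table"]}
            if current_thread is not None:
                source["thread"] = current_thread  # optional field
            yield {"source": source, "row": event["row"]}


events = [
    {"type": "BEGIN", "thread": 42},
    {"type": "ROW", "table": "customers", "row": {"id": 1}},
    {"type": "ROW", "table": "orders", "row": {"id": 7}},
    {"type": "COMMIT"},
    {"type": "BEGIN"},
    {"type": "ROW", "table": "orders", "row": {"id": 8}},
    {"type": "COMMIT"},
]
change_events = list(attach_thread_ids(events))
```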
Randall Hauch
03130d45ef DBZ-151 DBZ-171 Removed the MySQL integration tests
Maintaining these integration tests has turned out to be a nightmare, so I'm removing them from the assembly build.
2017-02-02 11:51:03 -06:00
Randall Hauch
53ad07e854 Merge pull request #172 from rhauch/dbz-174
DBZ-174 Added support for new binlog events
2017-02-01 15:36:04 -06:00
Randall Hauch
f2a65d03df DBZ-174 Added support for new binlog events
MySQL recently added additional binlog events, and this commit adds support to handle these new events by ignoring them.
2017-02-01 15:26:28 -06:00
Randall Hauch
a65ce08b62 Merge pull request #171 from rhauch/dbz-173
DBZ-173 Additional fixes to KafkaDatabaseHistory class for Kafka 0.10.1.0
2017-02-01 15:04:45 -06:00
Randall Hauch
972cfbe2c4 DBZ-173 Additional fixes to KafkaDatabaseHistory class for Kafka 0.10.1.0
The KafkaDatabaseHistory class was not behaving well in tests using my local development environment. When restoring from the persisted Kafka topic, the class would set up a Kafka consumer and see repeated messages. It is unclear whether the repeats were due to our test environment and very short poll timeouts. Regardless, the restore logic was refactored to track offsets so as to only process messages once.
2017-02-01 14:47:41 -06:00
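Offset tracking of this kind can be sketched in a few lines. The example assumes a single-partition topic and models each poll as a list of `(offset, record)` pairs; `restore_history` is a hypothetical name and no Kafka client API is used.

```python
def restore_history(polls):
    """Replay polled batches, skipping any offset that was already processed."""
    last_processed = -1
    restored = []
    for batch in polls:
        for offset, record in batch:
            if offset <= last_processed:
                continue  # repeated message: already applied
            restored.append(record)
            last_processed = offset
    return restored


# The second poll repeats offset 1, as observed in the tests described above.
polls = [
    [(0, "CREATE TABLE a"), (1, "ALTER TABLE a")],
    [(1, "ALTER TABLE a"), (2, "DROP TABLE a")],
]
history = restore_history(polls)
```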
Randall Hauch
c380305e9b Merge pull request #167 from hchiorean/DBZ-173
DBZ-173 Upgrades the Kafka artifact versions to 0.10.1.1
2017-01-27 13:48:48 -06:00
Horia Chiorean
d035c4bc8d DBZ-173 Changes the MySQL ITs to not use TZ information for expected dates and fixes the character set for parsing test files 2017-01-27 14:53:10 +02:00
Horia Chiorean
a2154d3d32 DBZ-173 Changes the MySQL ITs to use the database.hostname system property instead of always hardcoding 'localhost' 2017-01-27 09:19:57 +02:00
Horia Chiorean
7dfdef3558 DBZ-173 Upgrades the Kafka artifact versions to 0.10.1.1 2017-01-27 09:19:57 +02:00
Randall Hauch
4514026c82 Merge pull request #170 from hchiorean/DBZ-183
DBZ-183 Fixes the BinlogReader's handling of TIMESTAMP columns to correctly account for timezones
2017-01-26 13:29:39 -05:00
Horia Chiorean
031c4a1552 DBZ-183 Fixes the BinlogReader's handling of TIMESTAMP columns to correctly account for timezones 2017-01-25 16:39:36 +02:00
Randall Hauch
4cd71394a8 Merge pull request #169 from rhauch/fix-mysql-it-results
Fixed MySQL integration test expected results
2017-01-23 11:10:00 -06:00
Randall Hauch
ab62831f3b Fixed MySQL integration test expected results
A recent change to MySQL added quoted identifiers from DDL statements (e.g., resulting from `SHOW CREATE TABLE <quotedIdentifier>`), so the expected results were changed to reflect this. Also, the `pos` field is quite brittle and changes with many MySQL version upgrades in the Docker images, so those fields are now ignored during the integration tests.
2017-01-23 10:57:47 -06:00
Randall Hauch
c27009b344 Merge pull request #164 from rhauch/dbz-179
DBZ-179 Changed PostgreSQL connector codebase to fix JavaDoc errors
2017-01-20 13:30:06 -06:00
Randall Hauch
706d030383 Merge pull request #168 from rhauch/fix-mysql-int-tests
Fixed MySQL integration tests to handle new log events from new MySQL versions
2017-01-20 12:51:35 -06:00
Randall Hauch
4c803c1fdf Fixed MySQL integration tests to handle new log events from new MySQL versions 2017-01-20 12:30:56 -06:00
Randall Hauch
f0db8d1b1f DBZ-179 Corrected JavaDoc in PostgreSQL connector
Corrected the JavaDoc and removed trailing spaces in the PostgreSQL connector code.
2017-01-20 11:51:17 -06:00
Randall Hauch
e11f242b00 DBZ-179 Moved generated source for Protobuf
The project requires that JavaDoc exists and is valid for all public methods (e.g., has all @param, @return and @throws to match the signature). However, the generated Java source for Protobuf contains numerous JavaDoc errors relative to these settings. This causes lots of errors inside Eclipse (and probably other IDEs), but ignoring/disabling the JavaDoc errors leads to improper JavaDoc (fixed in next commit). By moving the generated Protobuf source code to a separate directory (e.g., `generated-sources`), the IDEs will automatically discover the directory and the user can ignore any compiler and JavaDoc errors/warnings for those files while keeping the more strict JavaDoc checking enabled for the rest of the code.
2017-01-20 11:50:08 -06:00
Randall Hauch
93d34e3177 Merge pull request #166 from hchiorean/DBZ-177
DBZ-177 Changes the way the PostgreSQL connector loads the JDBC driver, to use the connector's classloader
2017-01-18 11:14:32 -06:00
Horia Chiorean
c024d0789b DBZ-177 Changes the way the PostgreSQL connector loads the JDBC driver, to use the connector's classloader 2017-01-18 17:59:20 +02:00