Commit Graph

680 Commits

Author SHA1 Message Date
Jiri Pechanec
9d904a1150 DBZ-858 Upgrade to Kafka 2.0.0 2018-08-20 15:56:23 +02:00
Jenkins user
e00dad127f [maven-release-plugin] prepare for next development iteration 2018-07-26 08:00:12 +00:00
Jenkins user
16bfd5c700 [maven-release-plugin] prepare release v0.9.0.Alpha1 2018-07-26 08:00:12 +00:00
Jiri Pechanec
16666c2c58 DBZ-829 Upgrade to Kafka 1.1.1 2018-07-26 09:28:25 +02:00
Jenkins user
f9b8d830a8 [maven-release-plugin] prepare for next development iteration 2018-07-11 07:36:30 +00:00
Jenkins user
290ded678f [maven-release-plugin] prepare release v0.8.0.Final 2018-07-11 07:36:29 +00:00
Jenkins user
033db6659d [maven-release-plugin] prepare for next development iteration 2018-07-04 07:07:44 +00:00
Jenkins user
696f35f2c0 [maven-release-plugin] prepare release v0.8.0.CR1 2018-07-04 07:07:44 +00:00
Jiri Pechanec
d4c9d24b22 DBZ-779 Detect mongo replicator init completion 2018-07-03 12:24:06 +02:00
Jiri Pechanec
142e68e060 DBZ-742 Basic perf comparison of MySQL parsers 2018-06-28 15:16:23 +02:00
Gunnar Morling
bf7a5018ca Setting POM version back to 0.8.0-SNAPSHOT 2018-06-22 12:21:03 +02:00
Jenkins user
db42e4657a [maven-release-plugin] prepare for next development iteration 2018-06-21 14:07:45 +00:00
Jenkins user
c4b8ecaa99 [maven-release-plugin] prepare release v0.8.0.Beta1 2018-06-21 14:07:45 +00:00
Gunnar Morling
96c59a1568 DBZ-20 Moving debezium-connector-oracle from main to incubator repo 2018-06-20 13:05:37 +02:00
Gunnar Morling
d7e196a18e DBZ-20 Initial import of Oracle connector based on XStream 2018-06-20 13:05:37 +02:00
rkuchar
1df0b2033c DBZ-252 Create new debezium-ddl-parser module 2018-06-15 11:42:23 +02:00
Jiri Pechanec
0f0a5d4cd4 DBZ-687 Kafka 1.1.0 2018-05-14 09:37:33 +02:00
Jiri Pechanec
3e9489741d DBZ-529 Upgrade to MongoDB 3.6, compatibility testing 2018-03-29 15:47:59 +02:00
Jenkins user
f4e151b23a [maven-release-plugin] prepare for next development iteration 2018-03-20 08:14:19 +00:00
Jenkins user
93b3252332 [maven-release-plugin] prepare release v0.7.5 2018-03-20 08:14:19 +00:00
Matthias Wessendorf
48ac9b25d4 Using latest Kafka libs 2018-03-07 10:51:19 +01:00
Jenkins user
daf27207be [maven-release-plugin] prepare for next development iteration 2018-03-07 08:31:07 +00:00
Jenkins user
9c73774928 [maven-release-plugin] prepare release v0.7.4 2018-03-07 08:31:07 +00:00
Jenkins user
6d0cd88e12 [maven-release-plugin] prepare for next development iteration 2018-02-15 04:15:34 +00:00
Jenkins user
7d1e1a989e [maven-release-plugin] prepare release v0.7.3 2018-02-15 04:15:34 +00:00
Jenkins user
04624341f5 [maven-release-plugin] prepare for next development iteration 2018-01-25 09:39:44 +00:00
Jenkins user
898f6884e1 [maven-release-plugin] prepare release v0.7.2 2018-01-25 09:39:44 +00:00
Jenkins user
6bb34b42f9 [maven-release-plugin] prepare for next development iteration 2017-12-20 07:15:12 +00:00
Jenkins user
16dcd4c980 [maven-release-plugin] prepare release v0.7.1 2017-12-20 07:15:12 +00:00
Jenkins user
5e09932cb9 [maven-release-plugin] prepare for next development iteration 2017-12-15 05:10:23 +00:00
Jenkins user
6c1d61e03b [maven-release-plugin] prepare release v0.7.0 2017-12-15 05:10:23 +00:00
Andras Istvan Nagy
cc7459f4fb DBZ-349 removed jackson databind dependency originally introduced for DBZ-349 2017-12-13 12:34:30 +01:00
Andras Istvan Nagy
631c518d8e DBZ-349 Better support for large append-only tables by making the snapshotting process restartable 2017-12-13 12:34:30 +01:00
Jiri Pechanec
be52348bf1 DBZ-492 Rebase to Confluent 4.0.0 2017-12-01 09:57:39 +01:00
Gunnar Morling
f28f8b41a8 DBZ-285 Updating Confluent platform version to 3.3.0; it doesn't exactly match Kafka 1.0.0, but it's only used for a test dependency so that's alright 2017-11-13 05:55:29 +01:00
Gunnar Morling
5fbe742be8 DBZ-285 Specifying scope of dependencies in the individual POMs for the sake of comprehensibility 2017-11-10 16:48:32 +01:00
Gunnar Morling
580647b226 DBZ-285 Making more dependencies "provided" 2017-11-10 16:33:02 +01:00
Ewen Cheslack-Postava
8826669b43 DBZ-285: Use provided or test dependencies for Connect and Kafka dependencies 2017-11-04 12:01:24 -07:00
Jiri Pechanec
a6bd883857 DBZ-432 Rebased to Kafka 1.0.0 2017-11-03 11:06:18 +01:00
Gunnar Morling
38bda6625a DBZ-416 Removing unnecessary configuration of build helper plug-in;
It would have been needed for the Postgres module only anyway. But it seems the generator plug-in is adding that source path automatically to the compilation already, so it's not needed at all.
2017-10-26 15:23:02 +02:00
Jiri Pechanec
130413e419 DBZ-398 Upgraded binlog connector 2017-10-20 08:43:25 +02:00
Jiri Pechanec
9f8c713a7b DBZ-227 Surefire and failsafe plugins configured only once 2017-10-19 21:36:26 +02:00
Gunnar Morling
53aa3d5c39 DBZ-227 Enabling Debezium to be built on Java 9 2017-10-19 21:36:26 +02:00
Jiri Pechanec
8837d6900c DBZ-367 Kafka version promoted to 0.11.0.1 2017-10-13 11:17:52 +02:00
Jenkins user
75937711fa [maven-release-plugin] prepare for next development iteration 2017-09-21 04:42:02 +00:00
Jenkins user
a89b9332e4 [maven-release-plugin] prepare release v0.6.0 2017-09-21 04:42:02 +00:00
Jiri Pechanec
e4bc6670c8 DBZ-305 Rebase build process against Kafka 0.11 2017-08-17 18:51:01 +02:00
Jenkins user
214696ef0c [maven-release-plugin] prepare for next development iteration 2017-08-17 11:51:05 +00:00
Jenkins user
c867e6fea6 [maven-release-plugin] prepare release v0.5.2 2017-08-17 11:51:05 +00:00
Gunnar Morling
a8d1817c22 [maven-release-plugin] prepare for next development iteration 2017-06-09 16:14:31 +00:00
Gunnar Morling
3f512aace7 [maven-release-plugin] prepare release v0.5.1 2017-06-09 16:14:31 +00:00
Gunnar Morling
8e274de33d Adding myself as a developer to pom.xml 2017-06-09 18:04:16 +02:00
Gunnar Morling
5630b61be6 DBZ-222 dependency clean-up 2017-05-04 08:53:05 +02:00
Omar Al-Safi
791545c5f4 DBZ-222 Added support for MySQL POINT type 2017-05-04 08:53:05 +02:00
Randall Hauch
bcaf1a88b3 DBZ-213 Corrected MongoDB connector build
Changed how the mongo-init process waits to begin by now looking for the second MongoDB server log message
saying it is ready, since the MongoDB image now has different startup behavior.
2017-04-04 11:13:25 -05:00
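
The DBZ-213 fix above waits for the second "ready" message in the MongoDB server log before starting. A minimal sketch of that idea follows; the function name, the sample log lines, and the marker text are illustrative assumptions, not taken from the actual build configuration.

```python
def index_of_nth_match(log_lines, marker, n=2):
    """Return the index of the line with the n-th occurrence of marker, or None."""
    seen = 0
    for index, line in enumerate(log_lines):
        if marker in line:
            seen += 1
            if seen == n:
                return index
    return None


# Hypothetical container output: the server reports readiness twice.
sample_log = [
    "starting server",
    "waiting for connections",   # first startup
    "shutting down",
    "starting server",
    "waiting for connections",   # second startup: safe to begin
]
print(index_of_nth_match(sample_log, "waiting for connections"))
```

Counting occurrences rather than matching the first one avoids starting the init step while the server is still in its first, temporary startup.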
Randall Hauch
709cd8f3fe [maven-release-plugin] prepare for next development iteration 2017-03-27 11:28:12 -05:00
Randall Hauch
2bc3d45954 [maven-release-plugin] prepare release v0.5.0 2017-03-27 11:28:11 -05:00
Randall Hauch
7a72ed6ae6 Merge pull request #202 from don41382/upgrade-kafka-version-to-0.10.2.0
DBZ-203 Upgrade kafka version from 0.10.1.1 to 0.10.2.0
2017-03-17 17:21:48 -05:00
Randall Hauch
430d756062 [maven-release-plugin] prepare for next development iteration 2017-03-17 15:41:58 -05:00
Randall Hauch
536cbf6300 [maven-release-plugin] prepare release v0.4.1 2017-03-17 15:41:57 -05:00
Felix Eckhardt
a6a77a8f79 upgraded confluent platform from 3.1.2 to 3.2.0 2017-03-17 13:30:46 +11:00
Felix Eckhardt
5d414c521a upgraded kafka version from 0.10.1.1 to 0.10.2.0 2017-03-17 11:36:15 +11:00
Horia Chiorean
d2210a2a50 DBZ-158 Changes the version of the Postgres JDBC driver to 42.0.0 2017-02-24 09:09:04 +02:00
Randall Hauch
8c60c29883 [maven-release-plugin] prepare for next development iteration 2017-02-07 14:22:12 -06:00
Randall Hauch
20134286e9 [maven-release-plugin] prepare release v0.4.0 2017-02-07 14:22:11 -06:00
Randall Hauch
896dd35bcb DBZ-187 Upgrade MongoDB server and Java driver versions
Upgraded the MongoDB server to 3.2.12 and the Java driver to 3.4.2.
2017-02-07 12:49:50 -06:00
Randall Hauch
9ae50b3691 DBZ-186 Upgraded MySQL binary log client library
Upgraded Shyiko’s MySQL binary log client library from 0.8.0 to 0.9.0 to get new timeout behavior when it opens sockets and fix for JSON array processing.
2017-02-07 12:34:12 -06:00
Randall Hauch
65951308f7 DBZ-173 Upgraded Confluent Platform libraries
Some of our test cases verify (de)serialization using the Avro Converter, which is included in the Confluent Platform. This commit upgrades the Confluent Platform to version 3.1.2, which matches Kafka 0.10.1.1.
2017-02-07 11:18:21 -06:00
Randall Hauch
03130d45ef DBZ-151 DBZ-171 Removed the MySQL integration tests
Maintaining these integration tests has turned out to be a nightmare, so I'm removing them from the assembly build.
2017-02-02 11:51:03 -06:00
Randall Hauch
972cfbe2c4 DBZ-173 Additional fixes to KafkaDatabaseHistory class for Kafka 0.10.1.0
The KafkaDatabaseHistory class was not behaving well in tests using my local development environment. When restoring from the persisted Kafka topic, the class would set up a Kafka consumer and see repeated messages. It is unclear whether the repeats were due to our test environment and very short poll timeouts. Regardless, the restore logic was refactored to track offsets so as to only process messages once.
2017-02-01 14:47:41 -06:00
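
The offset tracking described in the DBZ-173 fix above can be sketched roughly as follows. This is an illustration only, assuming a single-partition topic where offsets increase monotonically; `restore_once` and the sample records are hypothetical and do not reproduce the `KafkaDatabaseHistory` code.

```python
def restore_once(polled_batches):
    """Replay history records, skipping any offset that was already applied."""
    last_offset = -1          # highest offset processed so far
    applied = []
    for batch in polled_batches:
        for offset, record in batch:
            if offset <= last_offset:
                continue      # repeated message from an earlier poll
            applied.append(record)
            last_offset = offset
    return applied


# Two polls that overlap on offsets 1 and 2.
batches = [
    [(0, "CREATE TABLE a"), (1, "ALTER TABLE a"), (2, "DROP TABLE b")],
    [(1, "ALTER TABLE a"), (2, "DROP TABLE b"), (3, "CREATE TABLE c")],
]
print(restore_once(batches))
```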
Horia Chiorean
7dfdef3558 DBZ-173 Upgrades the Kafka artifact versions to 0.10.1.1 2017-01-27 09:19:57 +02:00
Randall Hauch
e11f242b00 DBZ-179 Moved generated source for Protobuf
The project requires that all JavaDoc for public methods exist and are valid (e.g., have all @param, @return and @throws to match the signature). However, the generated Java source for Protobuf contains numerous JavaDoc errors relative to these settings. This causes lots of errors inside Eclipse (and probably other IDEs), but ignoring/disabling the JavaDoc errors leads to improper JavaDoc (fixed in next commit). By moving the generated Protobuf source code to a separate directory (e.g., `generated-sources`), the IDEs will automatically discover the directory and the user can ignore any compiler and JavaDoc errors/warnings for those files while keeping the more strict JavaDoc checking enabled for the rest of the code.
2017-01-20 11:50:08 -06:00
Randall Hauch
eeb4eafacf DBZ-3 Changed the PostgreSQL connector’s use of Docker containers
The PostgreSQL connector was not able to build locally, since the Maven build would wait forever trying to talk to the TCP port for PostgreSQL before starting the integration tests. Even when I corrected the `wait` specification to use the localhost (rather than the direct container address), the build successfully connected to Postgres when it started the first time but before it shut down to adjust the configuration, and thus the tests failed as the server was shut down. The build now looks for a specific log message which is unique and output by the container after the second startup, and this seems to work great (at least locally).
2017-01-13 16:44:30 -06:00
Horia Chiorean
6159524618 DBZ-3 Removes the sources for the Postgres JDBC driver and replaces them with a SNAPSHOT release which contains the streaming support 2016-12-27 14:44:33 +02:00
Horia Chiorean
737614a555 DBZ-3 Implements a connector for streaming changes from a Postgres database
The version of the DB server required for this to work is at least 9.4. To be able to stream logical changes, the code relies on enhancements to the JDBC driver which are not yet public. Therefore, the current codebase includes the sources for the JDBC driver.
The commit also updates the general DBZ build system for:
* custom checkstyle package exclusions - required by the Postgres driver and the protobuf code for now
* adds support for debugging Surefire and Failsafe
2016-12-27 14:44:32 +02:00
Horia Chiorean
23e3f59fa1 DBZ-3 Implements a connector for streaming changes from a Postgres database
The version of the DB server required for this to work is at least 9.4
The commit also updates the general DBZ build system for:
* custom checkstyle package exclusions - required by the Postgres driver and the protobuf code for now
* adds support for debugging Surefire and Failsafe
2016-12-27 14:44:32 +02:00
Horia Chiorean
8e14f150db DBZ-3 Adds the structure for a Postgres connector which uses a Debezium Postgres docker image that has the decoderbufs plugin enabled to read WAL changes 2016-12-27 14:44:29 +02:00
Randall Hauch
49e6231b69 DBZ-151 Removed integration module from normal build 2016-12-21 17:11:15 -06:00
Randall Hauch
08e32a4a8b DBZ-151 Added multiple integration test modules to test various MySQL versions and configurations.
These new modules run during the '-Passembly' profile and use the new integration test framework that compares all
output produced by a connector to expected results that were previously recorded and verified. These integration test modules
can be run manually with a simple build of those modules or their parent; only the top-level 'integration-tests' module is run
during the assembly profile during builds of the entire codebase.
2016-12-20 09:18:10 -06:00
Randall Hauch
0bf3b4c9f3 DBZ-157 Upgraded Docker Maven plugin
Upgraded the Docker Maven plugin to 0.18.1, which required changing our use of the `docker.image` to `docker.filter` (per the [changes in 0.17.1](https://github.com/fabric8io/docker-maven-plugin/blob/master/doc/changelog.md)).
2016-11-22 09:23:07 -06:00
Randall Hauch
bfbf485123 Upgrade MySQL JDBC driver 2016-11-14 13:41:01 -06:00
Randall Hauch
4a62b09ead DBZ-126 Added support for MySQL JSON type
Adds support for MySQL 5.7's `JSON` type, which is capable of holding JSON objects, JSON arrays, and scalar values. The Debezium MySQL connector represents `JSON` values as a string with a `io.debezium.data.Json` semantic type (which is basically a string schema that has a special name to denote the semantics), and the _contents_ of that string will be the JSON representation of the object, array, or scalar value.
2016-10-18 17:32:55 -05:00
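
As a rough illustration of the DBZ-126 representation above, a `JSON` column value can be thought of as a named string schema paired with the serialized JSON text. The dictionary layout and the `json_field` helper below are assumptions made for the sketch; only the `io.debezium.data.Json` name comes from the commit message.

```python
import json

JSON_SEMANTIC_TYPE = "io.debezium.data.Json"


def json_field(value):
    """Pair a string schema carrying the semantic name with the JSON text."""
    schema = {"type": "string", "name": JSON_SEMANTIC_TYPE}
    return schema, json.dumps(value)


# Objects, arrays, and scalars are all carried as JSON text.
for sample in ({"id": 7}, [1, "two"], 42):
    print(json_field(sample))
```

A consumer that recognizes the schema name can parse the string back into a structure, while one that does not can still treat it as an ordinary string.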
Randall Hauch
0012125635 Upgraded the Maven Docker plugin 2016-10-18 17:10:06 -05:00
Randall Hauch
7387654bfa DBZ-129 Additional improvements for MySQL connector GTID-based startup
Added more integration tests to verify the behavior of the MySQL connector when it is (re)starting using GTIDs.
2016-10-18 14:30:10 -05:00
Randall Hauch
ce2b2db80c DBZ-99 Added support for MySQL connector to connect securely to MySQL
Changed the MySQL connector to have several new configuration properties for setting up the SSL key store and trust store (which can be used in place of System or JDK properties) used for MySQL secure connections, and another property to specify what kind of SSL connection should be used.

Modified several integration tests to ensure all MySQL connections are made with `useSSL=false`.
2016-08-24 13:27:35 -05:00
Randall Hauch
e86fb83459 [maven-release-plugin] prepare for next development iteration 2016-08-16 09:56:47 -05:00
Randall Hauch
ccdb0a1a63 [maven-release-plugin] prepare release v0.3.0 2016-08-16 09:56:47 -05:00
Randall Hauch
b8fec14f7a Upgraded the Docker Maven Plugin 2016-08-15 13:02:37 -05:00
Randall Hauch
ed7d1ee8e6 Merge pull request #87 from rhauch/dbz-62
DBZ-62 Upgraded to Kafka 0.10.0.1 and Zookeeper 3.4.8
2016-08-15 12:39:11 -05:00
Randall Hauch
c2d210bbda DBZ-62 Upgraded to Kafka 0.10.0.1 and Zookeeper 3.4.8. 2016-08-11 12:31:59 -05:00
Randall Hauch
31641fb43e DBZ-91 Changed how temporal values are treated in MySQL connector
Rewrote how the MySQL connector converts temporal values to use schemas with names that identify the semantic
type of temporal value, and customized how the MySQL binlog client library creates Java object values from the
raw binlog events.

Several new "semantic" schema types were defined:

* `io.debezium.time.Year` represents a year number as an INT32 value (e.g., 2016, -345, etc.).
* `io.debezium.time.Date` represents a date by storing the epoch seconds (that is, the number of seconds past the epoch) as an INT64 value.
* `io.debezium.time.Time` represents a time by storing the milliseconds past midnight as an INT32 value.
* `io.debezium.time.MicroTime` represents a time by storing the microseconds past midnight as an INT32 value.
* `io.debezium.time.NanoTime` represents a time by storing the nanoseconds past midnight as an INT32 value.
* `io.debezium.time.Timestamp` represents a date and time (without timezone information) by storing the milliseconds past epoch as an INT64 value.
* `io.debezium.time.MicroTimestamp` represents a date and time (without timezone information) by storing the microseconds past epoch as an INT64 value.
* `io.debezium.time.NanoTimestamp` represents a date and time (without timezone information) by storing the nanoseconds past epoch as an INT64 value.
* `io.debezium.time.ZonedTime` represents a time with timezone and optional fractions of a second (but no date) by storing the ISO8601 form as a STRING value (e.g., `10:15:30+01:00`)
* `io.debezium.time.ZonedTimestamp` represents a date and time with timezone and optional fractions of a second by storing the ISO8601 form as a STRING value (e.g., `2011-12-03T10:15:30.030431+01:00`)

This range of semantic types allows for a far more accurate representation in the events of the temporal values stored within the database. The MySQL connector chooses the semantic type based upon the precision of the MySQL type (e.g., `TIMESTAMP(6)` will be represented with `io.debezium.time.MicroTimestamp`, whereas `TIMESTAMP(3)` will be represented with `io.debezium.time.Timestamp`). This ensures that the events do not lose precision and that the semantics of the database column values are retained in the events even though the values are represented with primitive values.
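
The precision-based choice described above can be sketched as follows. The two mappings for `TIMESTAMP(3)` and `TIMESTAMP(6)` come from the text; the thresholds for other precisions and the helper names are assumptions made for illustration.

```python
from datetime import datetime

EPOCH = datetime(1970, 1, 1)  # naive value, i.e. without timezone information


def timestamp_semantic_type(precision):
    """Pick a semantic type for TIMESTAMP(precision); thresholds are assumed."""
    if precision <= 3:
        return "io.debezium.time.Timestamp"       # milliseconds past epoch
    if precision <= 6:
        return "io.debezium.time.MicroTimestamp"  # microseconds past epoch
    return "io.debezium.time.NanoTimestamp"       # nanoseconds past epoch


def epoch_micros(value):
    """Encode a naive datetime as whole microseconds past the epoch."""
    delta = value - EPOCH
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


print(timestamp_semantic_type(6))
print(epoch_micros(datetime(1970, 1, 1, 0, 0, 1, 5)))
```

Using integer arithmetic on the `timedelta` fields keeps the microsecond digits exact, which is the point of choosing a type that matches the column precision.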

Obviously these Kafka Connect schema representations are different and more precise than the built-in `org.apache.kafka.connect.data.Date`, `org.apache.kafka.connect.data.Time`, and `org.apache.kafka.connect.data.Timestamp` logical types provided by Kafka Connect and used by the MySQL connector in all 0.2.x and 0.1.x versions. Migration to the new MySQL connector should be possible, although consumers may still need to know about these types to properly handle temporal values and the correct precision (i.e., consumers cannot just assume all date INT64 values represent milliseconds).

The MySQL binlog client library converted the raw binary event information to JDBC types using a local Calendar instance, which obviously incorporates the local timezone and cannot retain more than millisecond precision. This change extends the library's deserializers to instead use the Java 8 `java.time` classes and to retain the exact semantics of the database values and to not lose any precision (since the `java.time` classes have nanosecond precision).

The same logic is also used to convert the JDBC values obtained during a snapshot from the MySQL Connector/J JDBC driver. The latter has a few quirks, such as not returning any fractional seconds for `TIME` columns, even though `java.sql.Time` can store up to milliseconds.

Most of the logic of the conversions of values and mapping to Kafka Connect schemas is handled in the new `JdbcValueConverters`, which was extracted from the existing `TableSchemaBuilder`. The MySQL connector reuses and actually extends the `JdbcValueConverters` class with its own `MySqlValueConverters` class that also adds support for MySQL-specific types such as `YEAR`. Other connectors whose values are based on JDBC types should be able to reuse and/or extend the `JdbcValueConverters` class.

Integration tests that deal with temporal types were modified to use proper expected values and comparisons.
2016-08-10 15:51:07 -05:00
Randall Hauch
8cb39eacf0 Reverted back to 0.3.0-SNAPSHOT, since the 0.3 candidate release was not acceptable. 2016-08-01 12:25:58 -05:00
Randall Hauch
517272278d [maven-release-plugin] prepare for next development iteration 2016-07-25 17:50:31 -05:00
Randall Hauch
b89296e646 [maven-release-plugin] prepare release v0.3.0 2016-07-25 17:50:31 -05:00
Randall Hauch
cb8904819c Upgraded Docker Maven Plugin to 0.15.12 2016-07-25 17:46:35 -05:00
Randall Hauch
447acb797d DBZ-62 Upgraded to Kafka and Kafka Connect 0.10.0.0
Upgraded from Kafka 0.9.0.1 to Kafka 0.10.0. The only required change was to override the `Connector.config()` method, which returns `null` or a `ConfigDef` instance that contains detailed metadata for each of the configuration fields, including supporting recommended values and marking fields as not visible (e.g., if they don't make sense given other configuration field values). This can be used by user interfaces to data-drive the configuration of a connector. Also, the default validation logic of the Connector implementations uses a `Validator` that is pretty restrictive in its functionality.

Debezium already had a fairly decent and simple `Configuration` framework. After several attempts to try and merge these concepts, reconciling the two validation mechanisms was very complicated and involved a lot of changes. It was easier to simply continue Debezium-specific validation and to override the `Connector.validate(...)` method to use Debezium's `Configuration`-based validation. Connector-based validation logic includes determining recommended values, so Debezium's `Field` class (used to define each configuration property) was enhanced with a new `Recommender` class that is similar to Kafka's.

Additional integration tests were added to verify that the `ConfigDef` result is acceptable and that the new connector validation logic works as expected, including getting recommended values for some fields (e.g., database names, table/collection names) from MySQL and MongoDB by connecting and dynamically reading the values. This was done in a way that remains backward compatible with the regular expression formats of these fields, but in a user interface that uses the `ConfigDef` mechanism the user can simply select the databases and table/collection identifiers.
2016-07-25 14:21:31 -05:00
Randall Hauch
30777e3345 DBZ-85 Added test case and made correction to temporal values
Added an integration test case to diagnose the loss of the fractional seconds from MySQL temporal values. The problem appears to be a bug in the MySQL Binary Log Connector library that we used, and this bug was reported as https://github.com/shyiko/mysql-binlog-connector-java/issues/103. That was fixed in version 0.3.2 of the library, which Stanley was kind enough to release for us.

During testing, though, several issues were discovered in how temporal values are handled and converted from the MySQL events, through the MySQL Binary Log client library, and through the Debezium MySQL connector to conform with Kafka Connect's various temporal logical schema types. Most of the issues involved converting most of the temporal values from local time zone (which is how they are created by the MySQL Binary Log client) into UTC (which is how Kafka Connect expects them). Really, java.util.Date doesn't have time zone information and instead tracks the number of milliseconds past epoch, but the conversion of normal timestamp information to the milliseconds past epoch in UTC depends on the time zone in which that conversion happens.
2016-07-20 17:07:56 -05:00
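
The time zone dependence noted in the DBZ-85 message above can be shown with a small sketch: the same wall-clock reading maps to different "milliseconds past epoch" values depending on the zone in which the conversion is performed. The helper name and the sample value are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone

UTC_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_millis(wall_clock, utc_offset_hours):
    """Interpret a naive wall-clock value in the given zone, return epoch millis."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    aware = wall_clock.replace(tzinfo=zone)
    return (aware - UTC_EPOCH) // timedelta(milliseconds=1)


reading = datetime(2016, 7, 20, 17, 7, 56)
in_utc = epoch_millis(reading, 0)
in_utc_minus_5 = epoch_millis(reading, -5)
print(in_utc_minus_5 - in_utc)  # difference in milliseconds between the two interpretations
```

A value created in the local zone therefore has to be shifted before it can be treated as a UTC-based value.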
Randall Hauch
12e7cfb8d3 DBZ-2 Created initial Maven module with a MongoDB connector
Added a new `debezium-connector-mongodb` module that defines a MongoDB connector. The MongoDB connector can capture and record the changes within a MongoDB replica set, or when seeded with addresses of the configuration server of a MongoDB sharded cluster, the connector captures the changes from each replica set used as a shard. In the latter case, the connector even discovers the addition of or removal of shards.

The connector monitors each replica set using multiple tasks and, if needed, separate threads within each task. When a replica set is being monitored for the first time, the connector will perform an "initial sync" of that replica set's databases and collections. Once the initial sync has completed, the connector will then begin tailing the oplog of the replica set, starting at the exact point in time at which it started the initial sync. This is equivalent to how MongoDB replication works.

The connector always uses the replica set's primary node to tail the oplog. If the replica set undergoes an election and a different node becomes primary, the connector will immediately stop tailing the oplog, connect to the new primary, and start tailing the oplog using the new primary node. Likewise, if the connector experiences any problems communicating with the replica set members, it will try to reconnect (using exponential backoff so as to not overwhelm the replica set) and continue tailing the oplog from where it last left off. In this way the connector is able to dynamically adjust to changes in replica set membership and to automatically handle communication failures.
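
A minimal sketch of reconnecting with exponential backoff follows. The initial delay, growth factor, cap, and attempt limit are assumed values chosen for illustration; the commit message does not state the connector's actual settings.

```python
import time


def backoff_delays(initial=1.0, factor=2.0, maximum=30.0):
    """Yield an endless sequence of delays that grow geometrically up to a cap."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


def connect_with_backoff(connect, max_attempts=5, sleep=time.sleep):
    """Call connect() until it succeeds, sleeping longer after each failure."""
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise            # give up after the final attempt
            sleep(next(delays))
```

Passing `sleep` as a parameter keeps the retry loop testable without real waiting; a caller would normally leave the default.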

The MongoDB oplog contains limited information, and in particular the events describing updates and deletes do not actually have the before or after state of the documents. Instead, the oplog events are all idempotent, so updates contain the effective changes that were made during an update, and deletes merely contain the deleted document identifier. Consequently, the connector is limited in the information it includes in its output events. Create and read events do contain the initial state, but update events contain only the changes (rather than the before and/or after states of the document) and delete events do not have the before state of the deleted document. All connector events, however, do contain the local system timestamp at which the event was processed and _source_ information detailing the origins of the event, including the replica set name, the MongoDB transaction timestamp of the event, and the transaction identifier among other things.

It is possible for MongoDB to lose commits in specific failure situations. For example, if the primary applies a change and records it in its oplog before it then crashes unexpectedly, the secondary nodes may not have had a chance to read those changes from the primary's oplog before the primary crashed. If one such secondary is then elected as primary, its oplog is missing the last changes that the old primary had recorded and no longer has those changes. In these cases where MongoDB loses changes recorded in a primary's oplog, it is possible that the MongoDB connector may or may not capture these lost changes.
2016-07-14 13:02:36 -05:00
Randall Hauch
0f3ed9f50f DBZ-71 Corrected MySQL connector plugin archives and upgraded MySQL JDBC driver from 5.1.38 to 5.1.39 (the latest) 2016-06-09 21:15:34 -05:00
Randall Hauch
6749518f66 [maven-release-plugin] prepare for next development iteration 2016-06-08 13:00:50 -05:00
Randall Hauch
d5bbb116ed [maven-release-plugin] prepare release v0.2.0 2016-06-08 13:00:50 -05:00
Randall Hauch
096ea24000 DBZ-37 Upgraded the BinlogClient library from 0.2.4 to 0.3.1, which is the latest 2016-06-02 17:08:46 -05:00
Randall Hauch
264a9041df DBZ-64 Added Avro Converter to record verification utilities
The `VerifyRecord` utility class has methods that will verify a `SourceRecord`, and is used in many of our integration tests to check whether records are constructed in a valid manner. The utility already checks whether the records can be serialized and deserialized using the JSON converter (provided with Kafka Connect); this change also checks with the Avro Converter (which produces much smaller records and is more suitable for production).

Note that version 3.0.0 of the Confluent Avro Converter is required; version 2.1.0-alpha1 could not properly handle complex Schema objects with optional fields (see https://github.com/confluentinc/schema-registry/pull/280).

Also, the names of the Kafka Connect schemas used in MySQL source records have changed.

# The record's envelope Schema used to be "<serverName>.<database>.<table>" but is now "<serverName>.<database>.<table>.Envelope".
# The Schema for record keys used to be named "<database>.<table>/pk", but the '/' character is not valid within an Avro name, and has been changed to "<serverName>.<database>.<table>.Key".
# The Schema for record values used to be named "<database>.<table>", but to better fit with the other Schema names it has been changed to "<serverName>.<database>.<table>.Value".

Thus, all of the Schemas for a single database table have the same Avro namespace "<serverName>.<database>.<table>" (or "<topicName>") with Avro schema names of "Envelope", "Key", and "Value".
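
The naming scheme above can be written down as a small sketch. The function name and the sample identifiers are hypothetical; the name patterns follow the text.

```python
def schema_names(server_name, database, table):
    """Build the Envelope, Key, and Value schema names for one table."""
    namespace = f"{server_name}.{database}.{table}"
    return {kind: f"{namespace}.{kind}" for kind in ("Envelope", "Key", "Value")}


names = schema_names("dbserver1", "inventory", "customers")
for kind, name in names.items():
    print(kind, "->", name)
```

Because all three names share one prefix, they fall into the same Avro namespace, and none of them contains the `/` character that the old key name used.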

All unit and integration tests pass.
2016-06-02 16:54:21 -05:00
David Chen
339f03859c DBZ-63 Fix POM dependency management.
Thanks for the reminder from https://issues.jboss.org/browse/DBZ-63?focusedCommentId=13242595&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13242595
2016-05-25 15:21:45 +01:00
Randall Hauch
8f5487b2c0 [maven-release-plugin] prepare for next development iteration 2016-03-17 16:28:40 -05:00
Randall Hauch
c2b8ac50ae [maven-release-plugin] prepare release v0.1.0 2016-03-17 16:28:40 -05:00
Randall Hauch
eea175a5aa DBZ-32 Corrected assembly plugin descriptor in parent POM 2016-03-17 16:04:53 -05:00
Randall Hauch
b5945a24ec DBZ-32 Corrected assembly dependencies 2016-03-17 15:58:27 -05:00
Randall Hauch
0867bd7961 DBZ-32 Changed Maven build to support releasing to Maven Central via the Sonatype OSSRH. 2016-03-17 15:16:31 -05:00
Randall Hauch
91d200df51 DBZ-15 Removed some of the unnecessary JARs from the MySQL connector plugin kit 2016-03-17 11:03:27 -05:00
Randall Hauch
046fc83850 DBZ-23 Simplified PostgreSQL Connector's use of Docker plugin 2016-02-25 10:24:52 -06:00
Randall Hauch
42e531dbe9 DBZ-23 Simplified MySQL Connector's use of Docker plugin 2016-02-25 10:24:39 -06:00
Randall Hauch
7d4a996406 DBZ-23 Docker image created by the module no longer is tagged 2016-02-25 09:43:11 -06:00
Randall Hauch
73da199a4d DBZ-22 Adapted to the Docker Maven Plugin's move to Fabric8 community 2016-02-25 08:59:24 -06:00
Randall Hauch
92949d31c0 DBZ-21 Upgraded to Kafka 0.9.0.1 2016-02-23 15:26:02 -06:00
Randall Hauch
50e28d72a6 DBZ-17 Added plugin distribution ZIP that can be used for other Kafka Connector plugin modules 2016-02-23 13:23:36 -06:00
Randall Hauch
1d46e59048 DBZ-17 Minor changes to the POMs 2016-02-18 13:58:29 -06:00
Randall Hauch
0102f620a9 DBZ-13 Changed Maven build to attach JavaDoc JARs to each module
Modified the 'docs' profile to build and attach JavaDoc JARs for each module's source and test source artifacts. The profile will be automatically used when releasing.
2016-02-17 11:14:50 -06:00
Randall Hauch
dab0440612 DBZ-14 Corrected the 'alt-mysql' Maven profile so that it can be used with any of the other Maven commands. 2016-02-16 16:37:30 -06:00
Christian Posta
c730685a01 add option to run without integration tests 2016-02-15 16:26:32 -07:00
Randall Hauch
73f3c9836b DBZ-1 Completed integration testing and debugging of the MySQL connector 2016-02-15 14:46:12 -06:00
Randall Hauch
1a59f9b07c DBZ-11 Build can skip long-running unit and integration tests 2016-02-04 15:35:27 -06:00
Randall Hauch
54b822bb72 DBZ-10 Added small utility so unit tests can run an embedded Kafka cluster within the same process.
This utility is only suitable for unit tests and therefore is defined in the test JAR of the `debezium-core` module. It certainly should never be used for production purposes.
2016-02-04 15:18:27 -06:00
Randall Hauch
37d6a5e7da DBZ-1 Expanded documentation and improved EmbeddedConnector framework
Changed the EmbeddedConnector framework to initialize all major components via configuration properties rather than through the public builder. This increases the size of the configurations, but it simplifies what embedding applications must do to obtain an EmbeddedConnector instance.

The DatabaseHistory framework was also changed to be configurable in similar ways to the OffsetBackingStore. Essentially, connectors that want to use it (like the MySqlConnector) will describe it as part of the connector's configuration, allowing more flexibility in which DatabaseHistory implementation is used and how it is configured whether in Kafka Connect or as part of the EmbeddedConnector.

Added a README.md to `debezium-embedded` to provide documentation and sample code showing how to use the EmbeddedConnector.
2016-02-03 14:11:53 -06:00
Randall Hauch
0e58dba9d6 DBZ-1 Renamed the connector modules and packages 2016-02-02 16:58:48 -06:00
Randall Hauch
2da5b37f76 DBZ-1 Added support for recording and recovering database schema
Adds a small framework for recording the DDL operations on the schema state (e.g., Tables) as they are read and applied from the log, and when restarting the connector task to recover the accumulated schema state. Where and how the DDL operations are recorded is an abstraction called `DatabaseHistory`, with three options: in-memory (primarily for testing purposes), file-based (for embedded cases and perhaps standalone Kafka Connect uses), and Kafka (for normal Kafka Connect deployments).
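
The record-and-recover idea above can be sketched with an in-memory variant. The class, its method names, and the use of a plain integer as the log position are assumptions for illustration and do not mirror the real `DatabaseHistory` interface.

```python
class MemoryHistory:
    """Keep DDL statements in the order they were read from the log."""

    def __init__(self):
        self._records = []   # list of (position, ddl) tuples

    def record(self, position, ddl):
        """Remember a DDL statement together with the log position it came from."""
        self._records.append((position, ddl))

    def recover(self, up_to_position):
        """Return the DDL to re-apply, in order, to rebuild the schema state."""
        return [ddl for position, ddl in self._records if position <= up_to_position]


history = MemoryHistory()
history.record(10, "CREATE TABLE customers (id INT)")
history.record(25, "ALTER TABLE customers ADD name VARCHAR(50)")
history.record(40, "DROP TABLE customers")
print(history.recover(up_to_position=25))
```

Replaying only the statements up to the restart position rebuilds the table definitions as they stood at that point in the log.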

The `DatabaseHistory` interface methods take several parameters that are used to construct a `SourceRecord`. The `SourceRecord` type was not used, however, since that would result in this interface (and potential extension mechanism) having a dependency on and exposing the Kafka API. Instead, the more general parameters are used to keep the API simple.

The `FileDatabaseHistory` and `MemoryDatabaseHistory` implementations are both fairly simple, but the `FileDatabaseHistory` relies upon representing each recorded change as a JSON document. This is simple, is easily written to files, allows for recovery of data from the raw file, etc. Although this was done initially using Jackson, the code to read and write the JSON documents required a lot of boilerplate. Instead, the `Document` framework developed during Debezium's very early prototype stages was brought back. It provides a very usable API for working with documents, including the ability to compare documents semantically (e.g., numeric values are converted to be able to compare their numeric values rather than just compare representations) and with or without field order.
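
Semantic comparison of documents, as described above, might look roughly like the following sketch. It works on parsed JSON rather than on the `Document` framework itself, and the function name and `ordered` flag are hypothetical.

```python
import json


def documents_equal(left, right, ordered=False):
    """Compare parsed JSON values by meaning, optionally requiring field order."""
    if isinstance(left, dict) and isinstance(right, dict):
        if ordered and list(left) != list(right):
            return False
        return left.keys() == right.keys() and all(
            documents_equal(left[key], right[key], ordered) for key in left
        )
    if isinstance(left, list) and isinstance(right, list):
        return len(left) == len(right) and all(
            documents_equal(a, b, ordered) for a, b in zip(left, right)
        )
    return left == right   # numbers compare by value, so 1 equals 1.0


doc_a = json.loads('{"id": 1, "total": 2.0}')
doc_b = json.loads('{"total": 2, "id": 1.0}')
print(documents_equal(doc_a, doc_b))
print(documents_equal(doc_a, doc_b, ordered=True))
```

The two sample documents differ as text (field order and number formatting) yet describe the same values, which is the case a plain string comparison would get wrong.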

The `KafkaDatabaseHistory` is a bit more complicated, since it uses a Kafka broker to record all database schema changes on a single topic with a single partition, and then upon restart uses it to recover the history from that dedicated topic. This implementation also records the changes as JSON documents, keeping it simple and independent of the Kafka Connect converters.
2016-02-02 14:27:14 -06:00
Randall Hauch
4ddd4b33be Changed Docker usage on Travis-CI 2016-01-25 16:12:07 -06:00
Randall Hauch
8e6c615644 Added utilities for managing a relational schema's table definitions, with support for updating those by reading DDL 2016-01-20 08:53:29 -06:00
Randall Hauch
dffdfd8049 Added debezium-core and MySQL binary log reading tests. 2015-11-24 15:54:37 -06:00
Randall Hauch
0a99ed67cd Initial project skeleton
This initial commit defines several modules for ingesting from JDBC and specifically from PostgreSQL and MySQL. The two latter modules define separate unit tests and integration tests, and prior to running the integration tests create a Docker image with the respective database and start a Docker container. Any *.sql or *.sh files are run on the database, allowing the modules to easily create and populate databases used in the tests. The integration tests are then run (using the failsafe maven plugin), and regardless of whether there are any failures the Docker container is always shut down (at least when running `mvn install`). See the modules' README files for details.
2015-11-18 14:23:29 -06:00