tet123

Author	SHA1	Message	Date
jchipmunk	7c0ae3ee20	DBZ-1212 SLF4J usage issues	2019-04-04 21:32:12 +02:00
Joy Gao	f3220e9766	DBZ-1207 MySQL connection timeout after bootstrapping a new table	2019-04-02 12:52:39 +02:00
Jiri Pechanec	b90274cc63	DBZ-1194 Convert nulls even for key	2019-03-29 21:20:53 +01:00
ShubhamRwt	540a951211	DBZ- 362 Adding WhitespaceAfter check to Checkstyle	2019-03-28 09:24:11 +01:00
Jenkins user	8b0edd1871	[maven-release-plugin] prepare for next development iteration	2019-03-25 12:16:31 +00:00
Jenkins user	56f98a48b2	[maven-release-plugin] prepare release v0.9.3.Final	2019-03-25 12:16:30 +00:00
Jiri Pechanec	7524b1ec89	DBZ-1169 Mark outbox as incubating	2019-03-25 11:27:44 +01:00
Renato Mefi	0847870d21	DBZ-1169 Test for invalid Outbox operations Also removed unused code	2019-03-25 11:27:44 +01:00
Renato Mefi	be9c70db64	DBZ-1169 Ensure SMT configuration is valid Also had to update a validator	2019-03-25 11:27:44 +01:00
Renato Mefi	87a3278dfe	DBZ-1169 Support user set Schema version The cache mechanism had to be adapted in order to support non-versioned and versioned schemas, a test now confirms it's the same valueSchema instance created once.	2019-03-25 11:27:44 +01:00
Renato Mefi	4aec4d5e92	DBZ-1169 Schema should inherit from original type To allow more flexibility we should proxy the new Event Schema fields based on the original table columns which Debezium is capturing. The Schema is now built one time during the first Record in order to detect those types which are only available within the Record.	2019-03-25 11:27:44 +01:00
Renato Mefi	c14ce54a6f	DBZ-1169 Support additional user fields	2019-03-25 11:27:44 +01:00
Renato Mefi	2a8806e15c	DBZ-1169 Support setting Record timestamp It's a good practice to depend on the Kafka metadata instead of custom dates in the payload, this way for instance when using KStreams with Tumbling Window the dates are correctly matched.	2019-03-25 11:27:44 +01:00
Renato Mefi	8b689d7f03	DBZ-1169 Implement RegexRouter based on config	2019-03-25 11:27:44 +01:00
Renato Mefi	f0f13ed4a4	DBZ-1169 Implement basic behavior	2019-03-25 11:27:44 +01:00
Renato Mefi	ebf36e1ddb	DBZ-1169 Support skipping deletions and tombstones	2019-03-25 11:27:44 +01:00
Renato Mefi	6512c0b387	DBZ-1169 Add Configuration for Outbox Transform	2019-03-25 11:27:44 +01:00
Jiri Pechanec	9b405aa84d	DBZ-1039 Add missing spaces	2019-03-25 10:23:14 +01:00
shubham	8cca21e969	DBZ-1039 Make all ifs with braces	2019-03-25 10:23:14 +01:00
Addison Higham	8e21139ca3	DBZ-1082 Add new custom recovery mode, more metadata This commit does a few things: - Refactors snapshot modes to be encapsulated by an interface and to only use that interface in determining when to snapshot and in determing the type of the `RecordProducer` interface to instantiate - Refactors the configuration of existing snapshot modes to tie the existing snapshot modes to their aligned implementation - Adds a new snapshot.mode, custom, and a new configuration option to specify a custom implementation that will be loaded by the class loader - Changes the visibility of some classes to allow for custom snapshot modes to get enough context to make an informed choice - Adds some metadata about slots (the catalog_xmin) to give a full idea of the state of slots which can be useful in implementing snapshot modes (which is also configurable, as it can add some overhead) Together, these changes allow for a much broader flexibility got end users to implement a snapshot mode that can do more advanced snapshots, such as partial recovery or for partial snapshots for tables where not all records are needed. This could also be seen as superseeding the `snapshot.select.statement.overrides` to allow for users to dynamically build queries based on the state of the slot and the offsets consumed.	2019-03-24 11:56:40 +01:00
Jon Casstevens	2efdae1ffd	DBZ-1179 Set the default replication factor property for history topics when none is provided	2019-03-21 12:07:10 +01:00
Gunnar Morling	2ea87727cf	DBZ-1174 Using Instant instead of long for representing timestamps	2019-03-12 17:19:13 +01:00
Jiri Pechanec	71765a2ab2	DBZ-1170 Detect and handle errors like ULE	2019-03-11 10:29:54 +01:00
Josh Stanfield	e4b7b90637	DBZ-1162 fix to ensure hstore snapshots behave the same as streams	2019-03-04 13:39:50 +01:00
Jenkins user	50a7c568fa	[maven-release-plugin] prepare for next development iteration	2019-02-23 09:53:58 +00:00
Jenkins user	28f3839804	[maven-release-plugin] prepare release v0.9.2.Final	2019-02-23 09:53:58 +00:00
Gunnar Morling	6787d17d54	DBZ-1141 Misc. clean-up; * Removing redundant check for date mapping type * Always using String as fallback value for temporal values where needed * Pulling fallback temporal values up to JdbcValueConverters	2019-02-21 15:02:51 +01:00
Jiri Pechanec	41ba7128e3	DBZ-1141 Correct default falback values for connect date types	2019-02-21 15:02:51 +01:00
Keith Barber	95b2290df2	DBZ-1149 Ensure consistent Java types for fallback values when converting	2019-02-21 15:02:51 +01:00
Jiri Pechanec	ff5aaf2e9a	DBZ-1153 Support for special char names	2019-02-20 15:25:45 +01:00
Gunnar Morling	510e9415f9	DBZ-1137 Some clean-up; * Using existing justifyLeft() method * Giving expected values as Duration * Removing unused methods	2019-02-19 18:41:09 +01:00
Jiri Pechanec	0d60e60243	DBZ-1137 Default for TIME can be generic interval value	2019-02-19 13:40:01 +01:00
Jenkins user	b96e5fee0d	[maven-release-plugin] prepare for next development iteration	2019-02-13 11:13:44 +00:00
Jenkins user	3de2d04bcd	[maven-release-plugin] prepare release v0.9.1.Final	2019-02-13 11:13:44 +00:00
Jiri Pechanec	9049078607	DBZ-997 Handle de-sync of message and JDBC PK info	2019-02-13 10:13:12 +01:00
tomaz.fernandes	4e6b44d925	DBZ-1122 Fixed a potential NPE in FileDatabaseHistory	2019-02-12 13:17:09 +01:00
Gunnar Morling	2a486ecfff	DBZ-1019 Removing legacy methods from Metronome	2019-02-11 18:11:34 +01:00
Jiri Pechanec	fa6fcd8487	DBZ-1019 Guard against wall-clock change	2019-02-11 18:11:34 +01:00
Jiri Pechanec	4d76243227	DBZ-1121 Emit ALTER event for updated tables	2019-02-09 20:03:57 +01:00
Gunnar Morling	7076deba5a	DBZ-1118 Adding JUnit rule to skip tests depending on Postgres version	2019-02-04 14:43:19 +01:00
Jiri Pechanec	16620f0bf9	DBZ-1040 Cleaning up event formatting	2019-02-04 12:39:07 +01:00
Gunnar Morling	99caad60d8	DBZ-1040 Changing resolution of "lag behind source" to milli-seconds; Also using Duration/Instant types internally.	2019-02-04 12:39:07 +01:00
Gunnar Morling	f1c4234248	DBZ-1040 Removing value from metrics events	2019-02-04 12:39:07 +01:00
Gunnar Morling	913e0bc5c8	DBZ-1040 Making metadata provider a member of Metrics instead of passing it for each onEvent() call	2019-02-04 12:39:07 +01:00
Gunnar Morling	399f91ce84	DBZ-1040 Typo fix	2019-02-04 12:39:07 +01:00
Gunnar Morling	27274d664f	DBZ-1040 Passing source as DataCollectionId instead of Object	2019-02-04 12:39:07 +01:00
Jiri Pechanec	de171917dc	DBZ-1040 Improve event to string formatting	2019-02-04 12:39:07 +01:00
Jiri Pechanec	a621e43d8b	DBZ-1040 Expose transaction id	2019-02-04 12:39:07 +01:00
Jiri Pechanec	213d4a42ca	DBZ-1040 Metrics for internal queue	2019-02-04 12:39:07 +01:00
Jiri Pechanec	d993ee265e	DBZ-1040 Metrics for lag behind source, offset and tx count	2019-02-04 12:39:07 +01:00
jchipmunk	acbc94e8b0	DBZ-1112 rename prepareQuery() to prepareQueryWithBlockingConsumer() Reason: reference to prepareQuery is ambiguous [ERROR] both method prepareQuery(java.lang.String,io.debezium.jdbc.JdbcConnection.StatementPreparer,io.debezium.jdbc.JdbcConnection.BlockingResultSetConsumer) in io.debezium.jdbc.JdbcConnection and method prepareQuery(java.lang.String,io.debezium.jdbc.JdbcConnection.StatementPreparer,io.debezium.jdbc.JdbcConnection.ResultSetConsumer) in io.debezium.jdbc.JdbcConnection match	2019-01-30 09:36:09 +01:00
jchipmunk	ff4e38cc46	DBZ-1112 Strings.join() doesn't apply conversation for first element Replace: - Integer.parseInt() to Integer.valueOf() - Short.parseShort() and new Short() to Short.valueOf() - Long.parseLong() to Long.valueOf()	2019-01-30 09:36:09 +01:00
Gunnar Morling	2a80931d03	DBZ-962 Avoiding Short constructor	2019-01-29 10:28:48 +01:00
Gunnar Morling	e84c6e2238	DBZ-962 Avoiding dependency to JAXB	2019-01-29 10:28:48 +01:00
Jenkins user	c5b7d21d3e	[maven-release-plugin] prepare for next development iteration	2019-01-28 11:07:04 +00:00
Jenkins user	f1ae79ff73	[maven-release-plugin] prepare release v0.9.0.CR1	2019-01-28 11:07:04 +00:00
Gunnar Morling	ccff79e07d	DBZ-1105 Removing TableImpl#columnNames field	2019-01-28 10:23:56 +01:00
Gunnar Morling	a27c29765a	DBZ-947 Formatting	2019-01-28 09:32:57 +01:00
Grzegorz Kołakowski	64c45c7490	DBZ-947 Reset initial transaction isolation level at snapshot end	2019-01-28 09:32:57 +01:00
Gunnar Morling	ebe5fd7e85	DBZ-1067 Adding SourceRecordAssert	2019-01-23 08:31:22 +01:00
Gunnar Morling	9665af3ebc	DBZ-1086 Adding Ivan Kovbas to COPYRIGHT.txt	2019-01-18 21:43:45 +01:00
Ivan Kovbas	81f031cb7c	DBZ-1086 Added functionality to gracefully handle key-less messages in ByLogicalTableRouter transformation.	2019-01-18 21:38:52 +01:00
Gunnar Morling	cedef89a48	DBZ-1089 Using Strings#isNullOrEmpty(); adding test	2019-01-18 10:07:20 +01:00
Jiri Pechanec	28c9734685	DBZ-1089 Use only schema name when both catalog and schema is present	2019-01-18 10:07:20 +01:00
Jiri Pechanec	25156730f0	DBZ-1088 use correct key for update of PK insert event	2019-01-17 14:28:18 +01:00
Jiri Pechanec	ceb6c62721	DBZ-1064 Remove lazy fallback values	2019-01-16 13:01:36 +01:00
Jiri Pechanec	536e517c08	DBZ-1064 Geometry default value changes; fixes for default suppliers	2019-01-16 13:01:36 +01:00
Gunnar Morling	532c17f8aa	DBZ-1064 Using dedicated functional interface instead of Consumer; * Commenting * Moving type from "util" to "jdbc"	2019-01-16 13:01:36 +01:00
Jiri Pechanec	bff7eea734	DBZ-1064 Use convert value template	2019-01-16 13:01:36 +01:00
Jiri Pechanec	920e0cbd3f	DBZ-1064 Set default value only for non-nullable fields	2019-01-16 13:01:36 +01:00
Gunnar Morling	de356896c0	DBZ-1029 Adding one more test	2019-01-15 17:41:42 +01:00
Amit Sela	b0b7d942a3	DBZ-1073 Adding scale as schema parameter, if present	2019-01-11 15:40:30 +01:00
Grzegorz Kołakowski	91c72738c0	DBZ-1067 Add column blacklist field to RelationalDatabaseConnectorConfig	2019-01-09 18:13:26 +01:00
Jenkins user	5275f73424	[maven-release-plugin] prepare for next development iteration	2018-12-19 13:06:12 +00:00
Jenkins user	b6569c18ae	[maven-release-plugin] prepare release v0.9.0.Beta2	2018-12-19 13:06:12 +00:00
Jiri Pechanec	0e98bb428b	DBZ-978 Make offset readable in last event	2018-12-19 11:03:01 +01:00
luobo	fcfc019b6b	fix the wrong words	2018-12-14 13:05:14 +01:00
Gunnar Morling	47f22feddc	DBZ-978 Misc. clean-up	2018-12-13 21:17:23 +01:00
Jiri Pechanec	edec1c3090	DBZ-978 Added last event and captured tables metrics	2018-12-13 20:52:11 +01:00
Gunnar Morling	f89fc0bb14	DBZ-966 Awaiting snapshot delay only when actually doing a snapshot	2018-12-13 14:29:54 +01:00
Gunnar Morling	5542c0de29	DBZ-966 Making sure Postgres connector can be stopped while awaiting initial snapshot delay; * Also skipping snapshotting when requested to stop during initial delay * Showing remaining delay time in logs * Adding missing config	2018-12-13 14:29:54 +01:00
Gunnar Morling	7a577161da	DBZ-966 Pushing snapshot delay logic into coordinator; Using timer + metronome to ensure the delay phase will be cancelled upon connector shutdown	2018-12-13 14:29:54 +01:00
Grzegorz Kołakowski	7f8c0e8ff1	DBZ-966 Enable to delay initial snapshot in HistorizedRelationalSnapshotChangeEventSource	2018-12-13 14:29:54 +01:00
Grzegorz Kołakowski	2a1934eeb1	DBZ-966 Make snapshot.delay.ms global property	2018-12-13 14:29:54 +01:00
Jiri Pechanec	2b2ff0754d	DBZ-960 Run SnapshotReader in a single transaction	2018-12-11 14:15:53 +01:00
Jiri Pechanec	284ae98adf	DBZ-1003 Initialize database history	2018-12-03 21:52:05 +01:00
Renato Mefi	8c1ebc2a8f	DBZ-971 Unwrap can add data operation header	2018-11-30 12:06:18 +01:00
Renato Mefi	0af8dcfde0	DBZ-971 Ensure Envelope unwrap propagates headers	2018-11-30 12:06:18 +01:00
Jenkins user	db1d3a7fb8	[maven-release-plugin] prepare for next development iteration	2018-11-20 16:15:14 +00:00
Jenkins user	f83db82cea	[maven-release-plugin] prepare release v0.9.0.Beta1	2018-11-20 16:15:14 +00:00
Gunnar Morling	6bd4024618	DBZ-776 Misc. improvements; * Renaming getTimeSinceLastEvent() to getMilliSecondsSinceLastEvent() * Further unifying metrics implementation across connectors * Emitting event in EventDispatcher also if event is filtered out * Typo fixes	2018-11-16 08:52:07 +01:00
Jiri Pechanec	4510e2bbe6	DBZ-776 Cleanup after review	2018-11-16 08:52:07 +01:00
Gunnar Morling	360714a5a1	DBZ-776 Formatting and minor clean-up	2018-11-16 08:52:07 +01:00
Jiri Pechanec	b44c9fd04f	DBZ-776 A first set of metrics implemented in new framework	2018-11-16 08:52:07 +01:00
Jiri Pechanec	4cdc3e669b	DBZ-776 Extract metrics to a common code	2018-11-16 08:52:07 +01:00
Gunnar Morling	9436778a1d	DBZ-812 Misc. clean-up	2018-11-13 10:06:22 +01:00
Jiri Pechanec	33f6529614	DBZ-812 PK list is not-null for no PKs on table	2018-11-13 09:41:03 +01:00
Jiri Pechanec	5930c580f7	DBZ-812 Change extract column metadata	2018-11-13 09:41:03 +01:00
Jiri Pechanec	839d7c80eb	DBZ-812 Prep statements with different parameters; Extract table schema code	2018-11-13 09:41:03 +01:00
Gunnar Morling	92704ce394	DBZ-963 Removing hard-coded references to "target" dir; Also injecting actual build dir via system variable. Making DATA_DIR a simple string.	2018-11-08 10:35:07 +01:00
Anton Martynov	3401cd23ea	DBZ-963 Allowing to customize the directory where tests are storing their data	2018-11-08 10:34:10 +01:00
Gunnar Morling	9994172860	DBZ-953 Pulling decimal mode handling up to RelationalDatabaseConnectorConfig; Also using it in MySQL connector.	2018-11-08 09:23:28 +01:00
Grzegorz Kołakowski	2d5295805c	DBZ-941 Add connectionCreated template method in HistorizedRelationalSnapshotChangeEventSource The method allows to define steps which have to be taken just after the database connection is created (e.g. setting snapshot isolation level). By default no operation is executed.	2018-10-30 13:35:57 +01:00
Jenkins user	d952f5dfb0	[maven-release-plugin] prepare for next development iteration	2018-10-04 11:59:14 +00:00
Jenkins user	ff9b70278b	[maven-release-plugin] prepare release v0.9.0.Alpha2	2018-10-04 11:59:13 +00:00
jchipmunk	05ba99bb0a	DBZ-918 Adding Debezium connector field to source info This will allow consumers to recognize the Debezium connector used for creating a given message, helping them to adjust their behavior for a variety of connectors.	2018-10-04 09:00:13 +02:00
Gunnar Morling	39a47fabfa	DBZ-899 Using custom statement to obtain SQL type info in order to avoid N+1 issues with PgDatabaseMetaData#getTypeInfo()	2018-09-19 09:51:38 +02:00
Gunnar Morling	3331d4c1a9	DBZ-899 Removing unused field; Also inlining method only used in one place.	2018-09-19 09:51:38 +02:00
Jiri Pechanec	2253eaa6a6	DBZ-819 JdbcConnection is not thread safe	2018-09-18 09:05:36 +02:00
Jiri Pechanec	05ce2e3063	DBZ-819 Cache prepared statements in connection	2018-09-18 09:05:36 +02:00
jchipmunk	b0feaf496a	DBZ-633 Adding "field.blacklist" property for MongoDB connector The "field.blacklist" configuration property is an optional comma-separated list of the fully-qualified names of fields that should be excluded from change event message values. Fully-qualified names for fields are of the form "databaseName.collectionName.fieldName.nestedFieldName", where "databaseName" and "collectionName" may contain the wildcard (*) which matches any characters. Although the "field.blacklist" configuration property allows you to remove fields from the event values, the "_id" field is always included in the event’s key.	2018-09-14 12:54:11 +02:00
Gunnar Morling	5878e5105f	DBZ-878 Using LRU eviction strategy [As recommended](`4961277637 (commitcomment-30412629)`) by Ben Manes, the LIRS implementation seems to provide no advantage and isn't well tested.	2018-09-06 11:21:31 +02:00
Gunnar Morling	4961277637	DBZ-878 Making topic name cache size bound; * Imported BoundedConcurrentHashMap from Hibernate ORM as a thread-safe concurrent cache.	2018-09-05 12:56:53 +02:00
Gunnar Morling	e1bbbd7992	DBZ-865 DBZ-878 Replacing invalid topic name characters with "_";	2018-09-05 12:56:53 +02:00
Gunnar Morling	eba134817f	DBZ-894 Making TableIdToStringMapper dedicated functional interface with specifically named method	2018-09-04 12:43:50 +02:00
Jiri Pechanec	1373ac8475	DBZ-894 Introduce explicit mapping functional type	2018-09-04 12:39:10 +02:00
Jiri Pechanec	3e0648a42c	DBZ-894 Customize TableId comparison per-connector in filters	2018-09-04 09:18:15 +02:00
Gunnar Morling	985999b4af	DBZ-872 Making MySqlAntlrDdlParser#tableFilter final	2018-08-24 08:12:54 +02:00
Jiri Pechanec	63ee65bb7d	DBZ-872 Honor filters in ALTER TABLE parsing	2018-08-24 08:12:54 +02:00
Jiri Pechanec	82a93cdac5	DBZ-859 Heartbeat is sent after pipeline snapshot	2018-08-23 11:44:53 +02:00
Jiri Pechanec	985e03e7b5	DBZ-859 Force heartbeat after snapshot completion	2018-08-23 11:44:53 +02:00
Maciej Bryński	947abf9815	DBZ-857 Ability to rewrites deleted records	2018-08-14 10:37:39 +02:00
artiship	57fa31e1bc	DBZ-854 Correct param name for excludeColumns(String fullyQualifiedTableNames)	2018-08-14 08:39:28 +02:00
LiuHanlin	41e16f4a4f	DBZ-853 Fix kafka database history storage miscount attemp number even if there are more records to consume	2018-08-13 20:26:48 +02:00
Jenkins user	e00dad127f	[maven-release-plugin] prepare for next development iteration	2018-07-26 08:00:12 +00:00
Jenkins user	16bfd5c700	[maven-release-plugin] prepare release v0.9.0.Alpha1	2018-07-26 08:00:12 +00:00
Gunnar Morling	3061c1dc0b	DBZ-815 Passing offset to Heartbeat#heartbeat(); That's simpler to grasp than the approach of passing a supplier lambda to the constructor. Also it allows to pass on the offset via local variables instead of instance fields in some cases.	2018-07-25 14:49:52 +02:00
Gunnar Morling	0a659c0159	DBZ-815 Ensuring latest offset is propagated also if change events relate to non-whitelisted tables	2018-07-25 14:49:52 +02:00
Gunnar Morling	fa68c1a158	DBZ-816 Preparing code for database history set-up for re-use	2018-07-25 11:54:15 +02:00
Jiri Pechanec	e86ddc0ab2	DBZ-40 Add blocking version of MultiResultSetConsumer	2018-07-25 11:51:09 +02:00
Gunnar Morling	bd4f3d60fa	DBZ-804 Simplifying construction of VariableScaleDecimal	2018-07-24 20:48:52 +02:00
Gunnar Morling	98c329aabe	DBZ-40 Adding SnapshottingTask#toString()	2018-07-24 18:53:32 +02:00
Gunnar Morling	0362b333bf	DBZ-804 Better schema comparison in tests; preparing byte conversion	2018-07-24 05:18:52 +02:00
Gunnar Morling	9814a5c82c	DBZ-803 Finalizing snapshot in offsets also for schema-only snapshots	2018-07-23 06:44:04 +02:00
Gunnar Morling	2905cc498b	DBZ-800 Don't expect that partition is a Map<String, String>	2018-07-20 13:37:54 +02:00
Gunnar Morling	fe53360f30	DBZ-800 Avoiding "mysql" in commonly used heartbeat key schema	2018-07-20 13:37:54 +02:00
Gunnar Morling	86a4731587	DBZ-800 Emitting heartbeat events in generic event dispatcher	2018-07-20 13:37:54 +02:00
Gunnar Morling	26106cdd8e	DBZ-800 Removing unused method from Heartbeat	2018-07-20 13:37:54 +02:00
Jiri Pechanec	0e35439019	DBZ-40 Snapshotting and connector restart	2018-07-18 14:39:11 +02:00
Gunnar Morling	a5b5e7d9eb	DBZ-627 Using unfified TopicSelector for MongoDB, too	2018-07-18 12:36:07 +02:00
Gunnar Morling	20db9299c5	DBZ-627 Unifying TopicSelector implementations	2018-07-18 12:36:07 +02:00
Gunnar Morling	610da70cc5	DBZ-801 Parameterizing ChangeEventQueue<DataChangeEvent>	2018-07-18 10:30:22 +02:00
Gunnar Morling	80085567ae	DBZ-720 Proper check for previous unfinished snapshot	2018-07-18 10:07:50 +02:00
Gunnar Morling	33170d4c3d	DBZ-40 Pulling up some common code to HistorizedRelationalDatabaseSchema	2018-07-18 10:07:50 +02:00
Gunnar Morling	422a303f87	DBZ-720 Signaling cancellation of snapshot through InterruptedException	2018-07-17 13:11:02 +02:00
Jiri Pechanec	f118e37da0	DBZ-40 Changes to support SQL Server connector	2018-07-13 22:23:00 +02:00
Gunnar Morling	c5e2d5f9e7	DBZ-720 Making scan of single table interruptable; Also better logging.	2018-07-13 19:42:18 +02:00
Gunnar Morling	648d695c08	DBZ-720 Preparing HistorizedRelationalSnapshotChangeEventSource to support data snapshotting	2018-07-13 19:42:18 +02:00
Gunnar Morling	aa1272cc21	DBZ-720 Re-using single instance of streaming event receiver in EventDispatcher	2018-07-13 09:51:15 +02:00
Gunnar Morling	657650c581	DBZ-720 Passing change event emitter directly to dispatch methods; The indirection of going through a supplier wasn't really necessary	2018-07-13 09:51:15 +02:00
Gunnar Morling	a565361e56	DBZ-720 Making SnapshotContext auto-closeable	2018-07-13 09:51:15 +02:00
Gunnar Morling	986df43649	DBZ-720 Extracting HistorizedRelationalSnapshotChangeEventSource from Oracle connector	2018-07-13 09:51:15 +02:00
Gunnar Morling	1523f230ca	DBZ-793 Using TableFilter consistently; Only the MySQL connector's Filters class still uses Predicate for the time being, so to avoid to much merging trouble with DBZ-175.	2018-07-11 13:16:44 +02:00
Jenkins user	f9b8d830a8	[maven-release-plugin] prepare for next development iteration	2018-07-11 07:36:30 +00:00
Jenkins user	290ded678f	[maven-release-plugin] prepare release v0.8.0.Final	2018-07-11 07:36:29 +00:00
Jenkins user	033db6659d	[maven-release-plugin] prepare for next development iteration	2018-07-04 07:07:44 +00:00
Jenkins user	696f35f2c0	[maven-release-plugin] prepare release v0.8.0.CR1	2018-07-04 07:07:44 +00:00
Gunnar Morling	99048365c9	DBZ-771 Unsetting default value when changing a column's type	2018-07-03 11:55:20 +02:00
Gunnar Morling	c7d5288d40	DBZ-771 Making sure causing DDL statement is logged also if default value can't be parsed	2018-07-03 11:55:20 +02:00
Gunnar Morling	dcd6b01134	DBZ-644 Simplifying RelationalDatabaseSchema constructor; Also retrieving logical name via CommonConnectorConfig#getLogicalName() in more places	2018-07-03 07:03:59 +02:00
Gunnar Morling	b0980b994c	DBZ-644 Adding test for Postgres	2018-07-03 07:03:59 +02:00
Gunnar Morling	068aa85bd6	DBZ-644 Misc. clean-up and tests; * Using parameter keys under "__debezium" namespace * More expressive names * Adding tests	2018-07-03 07:03:59 +02:00
Gunnar Morling	8744488c8a	DBZ-644 Unifying retrieval of ColumnMappers in RelationalDatabaseSchema	2018-07-03 07:03:59 +02:00
orr.ganani	ac515b8064	DBZ-644 change Original Data Type mapper and add a test	2018-07-03 07:03:59 +02:00
orr.ganani	747184c572	DBZ-644 add JDBC data type as additional metadata to DBZ events, if configuration is opted in	2018-07-03 07:03:59 +02:00
Gunnar Morling	c4c8cbc3ab	DBZ-773 Moving management of Tables to RelationalDatabaseSchema	2018-06-29 09:09:10 +02:00
Gunnar Morling	bbfbdf6fab	DBZ-773 Moving management of table schemas to RelationalDatabaseSchema	2018-06-29 09:09:10 +02:00
Gunnar Morling	a292b05a96	DBZ-751 Propagating DECIMAL column precision to Avro schemas	2018-06-27 14:48:59 +02:00
Gunnar Morling	b27421ddd0	DBZ-759 Testing via SchemaUtil; * Making RecordWriter private * Removing quotes around string representation of byte arrays	2018-06-26 12:03:53 +02:00
Andreas Bergmeier	82bac78ba8	DBZ-759 Add Test to ensure correct serialization of byte arrays Needed to make RecordWriter public to have access in tests.	2018-06-26 11:50:54 +02:00
Andreas Bergmeier	71abca2afb	DBZ-759 Fix unusable array information in Serialization According to ErrorProne, (implicitly) calling toString on the array does not give useful information. Also applying Base64 encoding is strange here.	2018-06-26 11:50:54 +02:00
Gunnar Morling	8d0d35762e	DBZ-759 Removing superfluous (and broken) ColumnEditorImpl#compareTo() implementation	2018-06-25 17:04:45 +02:00
Gunnar Morling	bf7a5018ca	Setting POM version back to 0.8.0-SNAPSHOT	2018-06-22 12:21:03 +02:00
Gunnar Morling	357102158a	DBZ-759 Fixing broken Document#setArray() method	2018-06-22 11:03:00 +02:00
Gunnar Morling	ce81f37990	DBZ-759 Adding unit test	2018-06-22 09:19:45 +02:00
Andreas Bergmeier	7417baea4d	DBZ-759 Fix equality comparison for BinaryValue If value is binary or string, it should be better to compare the content of the actual byte arrays. According to ErrorProne you use reference equality when calling equals on an array. Part of https://issues.jboss.org/browse/DBZ-759	2018-06-22 09:19:45 +02:00
Andreas Bergmeier	333d6cb57d	DBZ-759 Fix hashCode of BinaryValue According to ErrorProne, the hashCode of a byte array returns useless information. Part of https://issues.jboss.org/browse/DBZ-759	2018-06-22 09:19:41 +02:00
Andreas Bergmeier	d374ca0464	DBZ-759 Fix ColumnImpl.scale equality check According to ErrorProne, checking Optional via operator == leads to reference equality check. Probably want rather to check for contained values.	2018-06-22 09:08:27 +02:00
Jenkins user	db42e4657a	[maven-release-plugin] prepare for next development iteration	2018-06-21 14:07:45 +00:00
Jenkins user	c4b8ecaa99	[maven-release-plugin] prepare release v0.8.0.Beta1	2018-06-21 14:07:45 +00:00
Gunnar Morling	96c59a1568	DBZ-20 Moving debezium-connector-oracle from main to incubator repo	2018-06-20 13:05:37 +02:00
Gunnar Morling	b83f1493a2	DBZ-20 Avoiding Optional as method parameter	2018-06-20 13:05:37 +02:00
Gunnar Morling	dce1aa29f1	DBZ-20 Oracle precision supports negative numbers	2018-06-20 13:05:37 +02:00
Gunnar Morling	f4e3668ebf	DBZ-20 Tune up timeouts/connector/oracle/OracleStreamingChangeEventSource.java	2018-06-20 13:05:37 +02:00
Gunnar Morling	cf6219ccda	DBZ-20 Committing processed offsets in the DB, allowing log files to be released	2018-06-20 13:05:37 +02:00
Jiri Pechanec	661c34fc6b	DBZ-20 Use fluent API in ColumnEditor	2018-06-20 13:05:37 +02:00
Jiri Pechanec	232a1d5573	DBZ-20 Support for day to second interval	2018-06-20 13:05:37 +02:00
Jiri Pechanec	89694ffbbf	DBZ-20 Support for year to month interval	2018-06-20 13:05:37 +02:00
Jiri Pechanec	28b844bf99	DBZ-20 Added ANSI double precision, real and variable scale number datatypes	2018-06-20 13:05:37 +02:00
Jiri Pechanec	10ddb69dea	DBZ-20 A subset of string, time and number datatypes	2018-06-20 13:05:37 +02:00
Gunnar Morling	603f02e70d	DBZ-20 Excluding generated sources from CheckStyle	2018-06-20 13:05:37 +02:00
Gunnar Morling	e8614c7890	DBZ-20 Adding table whitelist/blacklist connector options; * Restructuring OracleSnapshotChangeEventSource into smaller methods	2018-06-20 13:05:37 +02:00
Gunnar Morling	1e87fae023	DBZ-20 Recovering database history	2018-06-20 13:05:37 +02:00
Gunnar Morling	0733110696	DBZ-20 Recording schema history; This records the DDL for DDL events captured during streaming. For the initial schema snapshot, a JSON-style representation of the captured Table objects is used in a new field of HistoryRecord, as the DDL returned by dbms_metadata.get_ddl() isn't fully parseable by our grammar.	2018-06-20 13:05:37 +02:00
Gunnar Morling	1a7786d254	DBZ-20 Further simplifying threading model and stopping logic	2018-06-20 13:05:37 +02:00
Gunnar Morling	d8a9865e51	DBZ-20 Some more docs	2018-06-20 13:05:37 +02:00
Gunnar Morling	225c0f05a6	DBZ-20 Extracting RelationalChangeRecordEmitter base class	2018-06-20 13:05:37 +02:00
Gunnar Morling	979c60d956	DBZ-20 Handling update events	2018-06-20 13:05:37 +02:00
Gunnar Morling	24e22f4985	DBZ-20 Simpler and more robust interruption handling in ChangeEventSourceCoordinator	2018-06-20 13:05:37 +02:00
Gunnar Morling	d7e196a18e	DBZ-20 Initial import of Oracle connector based on XStream	2018-06-20 13:05:37 +02:00
Gunnar Morling	9eb4b90ec9	DBZ-191 Adding record validations	2018-06-15 20:23:17 +02:00
Gunnar Morling	84dd36df46	DBZ-252 Misc typo fixes and clean-up	2018-06-15 16:43:30 +02:00
Gunnar Morling	54ca30624d	DBZ-252 Misc. improvements; * Dedicated getter for DDL mode * Using Objects#equal() * Typo fixes	2018-06-15 16:43:29 +02:00
Jiri Pechanec	9580c8c290	DBZ-252 Rebase to master	2018-06-15 11:42:24 +02:00
rkuchar	7d7f740721	DBZ-252 refactor parser listener	2018-06-15 11:42:24 +02:00
rkuchar	5c9e718436	DBZ-252 uncomment getting ddl parser instance from configuration + improve/refactor dataType resolver for ANTLR	2018-06-15 11:42:24 +02:00
rkuchar	99eac9a8e2	DBZ-252 some test fixes + separate ddl test for antlr parser	2018-06-15 11:42:24 +02:00
rkuchar	22151250fa	DBZ-252 refactor proxy listener delegation	2018-06-15 11:42:24 +02:00
rkuchar	07212043b5	DBZ-252 test fixes + grammar fixes + some bug repairs in antlr parser	2018-06-15 11:42:23 +02:00
rkuchar	1124f81d94	DBZ-252 parse set and use statements	2018-06-15 11:42:23 +02:00
rkuchar	b4480c55b7	DBZ-252 parse database queries (create, alter, delete), parse name tokens without quotes, add new Mysql connector configuration field for choosing ddl parsing mode	2018-06-15 11:42:23 +02:00
rkuchar	5d911d511f	DBZ-252 signal event for truncate table into schema changes	2018-06-15 11:42:23 +02:00
rkuchar	32d181480d	DBZ-252 fix troubles with generalized system variables + add new data types that was missing in grammar	2018-06-15 11:42:23 +02:00
rkuchar	85e8146d1f	DBZ-252 implementation of scope synonym for local and session scope in MySQL	2018-06-15 11:42:23 +02:00
rkuchar	5851d1f07d	DBZ-252 fix initialization of system variables object + add default scope for variales in case that scope will no be set	2018-06-15 11:42:23 +02:00
rkuchar	9311c82bc5	DBZ-252 reformat code to not have anything after closing "}"	2018-06-15 11:42:23 +02:00
rkuchar	782ab75160	DBZ-252 move listener support from base to only legacy implementation, refactor system variables for possible use with other DBMSs + introduce data type resolver for antlr mysql parser	2018-06-15 11:42:23 +02:00
rkuchar	b92e8ef760	DBZ-252 exclude logging of parsed comment	2018-06-15 11:42:23 +02:00
rkuchar	30e158aa71	DBZ-252 create table and drop table listeners	2018-06-15 11:42:23 +02:00
rkuchar	785c346bcb	DBZ-252 Exclude reusable logic from LegacyDdlParser	2018-06-15 11:42:23 +02:00
rkuchar	36e2f33bf6	DBZ-252 - extract creation of DdlParser interface	2018-06-15 11:42:23 +02:00
Jiri Pechanec	29f8891b4f	DBZ-578 Process timestamps without timezones as in server timezone	2018-06-11 15:27:08 +02:00
Jiri Pechanec	de0e1ad4b5	DBZ-702 Mask SASL configuration	2018-06-08 08:57:50 +02:00
Jiri Pechanec	176dcbbc0b	DBZ-709 PostgreSQL uses scale for time precision not length	2018-06-05 15:51:35 +02:00
luobo	5b33d81f00	DBZ-191 support default value of column in mysql	2018-06-04 10:43:53 +02:00
Gunnar Morling	61f576bb90	DBZ-706 Misc. clean-up	2018-06-01 11:23:30 +02:00
Stephen Powis	40d872f48c	DBZ-706 General cleanup	2018-06-01 11:23:30 +02:00
Stephen Powis	867202e0ca	DBZ-706 Slight refactoring / shuffling of code to validate source query field in snapshot tests	2018-06-01 11:23:12 +02:00
Jiri Pechanec	764a1d0d58	DBZ-693 SQL parsing in separate method	2018-05-28 12:35:47 +02:00
Jiri Pechanec	45bc7b1536	DBZ-693 On connect statements supported for PostgreSQL	2018-05-28 12:35:47 +02:00
Jiri Pechanec	f74a9fe059	DBZ-693 Initial SQL statements	2018-05-28 12:35:47 +02:00
Jiri Pechanec	f844054f50	DBZ-696 Support nanoseconds in timestamp data	2018-05-22 11:33:50 +02:00
Jiri Pechanec	09b28c0913	DBZ-694 Support milliseconds for pre-epoch dates	2018-05-22 11:22:06 +02:00
Rao	15a90548fb	DBZ-666 Supporting ordered snapshot using tables.whitelist config	2018-05-17 06:23:31 +02:00
Jiri Pechanec	0f0a5d4cd4	DBZ-687 Kafka 1.1.0	2018-05-14 09:37:33 +02:00
Omar Al-Safi	8ef9267808	DBZ-668 Added unit test and changed the key field name	2018-03-23 10:50:42 +01:00
Omar Al-Safi	21d2e0b8a9	DBZ-668 Changed the key schema for the heartebat messages from STRING to STRUCT	2018-03-23 10:50:42 +01:00
Jiri Pechanec	2b5370de1a	DBZ-667 Sceintific numbers parsing	2018-03-20 14:42:55 +01:00
Gunnar Morling	00b63bb808	DBZ-669 Closing AdminClient instance after usage	2018-03-20 13:07:42 +01:00
Jenkins user	f4e151b23a	[maven-release-plugin] prepare for next development iteration	2018-03-20 08:14:19 +00:00
Jenkins user	93b3252332	[maven-release-plugin] prepare release v0.7.5	2018-03-20 08:14:19 +00:00
Gunnar Morling	5f61cb167b	DBZ-655 Using AssertJ asserts in UnwrapFromEnvelopTest	2018-03-19 09:28:30 +01:00
Gunnar Morling	5358167446	DBZ-655 Adding test	2018-03-19 09:28:30 +01:00
Maciej Bryński	d5abb8415c	DBZ-655 Use cache to avoid using regexp replace	2018-03-19 09:28:23 +01:00
Gunnar Morling	d73c27d9bd	DBZ-657 Adding test	2018-03-18 18:26:27 +01:00
Jiri Pechanec	87bd3e9c2f	DBZ-657 Handle JSON with control chars not according to specs	2018-03-18 18:03:56 +01:00
Jiri Pechanec	9884a5ca1d	DBZ-663 disable cleaning of history topic based on size	2018-03-16 16:14:12 +01:00
Jiri Pechanec	62c7fe12f9	Do not log FLUSH RELAY LOGS statements	2018-03-16 09:51:48 +01:00
Jenkins user	daf27207be	[maven-release-plugin] prepare for next development iteration	2018-03-07 08:31:07 +00:00
Jenkins user	9c73774928	[maven-release-plugin] prepare release v0.7.4	2018-03-07 08:31:07 +00:00
Jiri Pechanec	327085164e	DBZ-611 Special values supported for double	2018-03-06 16:55:24 +01:00
Gunnar Morling	1f385c21f9	DBZ-611 Typo fix and formatting	2018-03-06 16:55:24 +01:00
Jiri Pechanec	424aefbbb4	DBZ-611 Decimal-> String supported for MySQL	2018-03-06 16:55:24 +01:00
Gunnar Morling	401cbae852	DBZ-632 Removing methods from HistoryRecord which where only used in tests; Also adding a new test for HistoryRecord (de-)serialization	2018-03-02 05:28:16 +01:00
Gunnar Morling	d040ec4427	DBZ-351, DBZ-606 Misc. clean-up; * Renaming DebeziumDecimal to SpecialValueDecimal * Simplifying logic in PostgresValueConverter	2018-02-27 16:36:21 +01:00
Jiri Pechanec	2534696b21	DBZ-606 Exception thrown for special values when mode handling is not string	2018-02-27 16:36:21 +01:00
Jiri Pechanec	5827a7ac1b	DBZ-611 Decimals can be encoded as strings	2018-02-27 16:36:21 +01:00
Jiri Pechanec	c71707afc9	DBZ-351 wal2json usses unlimited precision too	2018-02-27 16:36:21 +01:00
Jiri Pechanec	0c0ac72568	DBZ-606 Type for handling special BigDecimals	2018-02-27 16:36:21 +01:00
Jiri Pechanec	ceaa7f8efe	DBZ-351 Numeric is passed with unlimited precision	2018-02-27 16:36:21 +01:00
Gunnar Morling	d89f4e91c6	DBZ-632 Removing unused method parameter	2018-02-23 21:54:41 +01:00
Gunnar Morling	eef1ad7c2a	DBZ-630 Refactoring around MongoDB task context; * Renaming ConnectorTaskContext to CdcSourceTaskContext * Renaming ReplicationContext to MongoDbTaskContext * Making relationship from MongoDbTaskContext to ConnectionContext has-a instead of is-a	2018-02-21 12:04:41 +01:00
Gunnar Morling	bb12e521f8	DBZ-630 Pulling up getClock() to ConnectorTaskContext	2018-02-21 12:04:41 +01:00
Gunnar Morling	8264c20cf6	DBZ-630 Unifying common start-up logic across connectors	2018-02-21 12:04:41 +01:00
Gunnar Morling	7fe45db831	DBZ-626 Renaming AvroValidator to SchemaNameAdjuster; Making apparent that this functionality doesn't solely validate but returns a new name actually	2018-02-20 09:26:47 +01:00
Gunnar Morling	cddd5fcc8a	DBZ-626 Passing schema validator as AvroValidator instance instead of Function	2018-02-20 09:26:47 +01:00
Jiri Pechanec	6e00924ee9	DBZ-625 Fix parsing of float without decimals	2018-02-20 08:47:43 +01:00
Gunnar Morling	b5856f37c7	DBZ-628 Unrelated typo fix	2018-02-20 06:34:42 +01:00
Gunnar Morling	8369f10d52	DBZ-628 Adding MAX_QUEUE_SIZE, MAX_BATCH_SIZE and POLL_INTERVAL_MS to CommonConnectorConfig; Also using these options in PG connector	2018-02-20 06:34:42 +01:00
Gunnar Morling	5bc62dc862	DBZ-620 Removing method only used in tests	2018-02-19 12:18:37 +01:00
Gunnar Morling	ab11f48843	DBZ-620 Creating envelope schemas only once per table type	2018-02-19 12:18:37 +01:00
Jenkins user	6d0cd88e12	[maven-release-plugin] prepare for next development iteration	2018-02-15 04:15:34 +00:00
Jenkins user	7d1e1a989e	[maven-release-plugin] prepare release v0.7.3	2018-02-15 04:15:34 +00:00
Gunnar Morling	b1362f9db9	DBZ-278 Extracting broker config retrieval into its own method	2018-02-13 17:12:11 +01:00
Jiri Pechanec	90e4aa0dea	DBZ-278 Fail if topic is not created	2018-02-13 16:48:42 +01:00
Jiri Pechanec	4541fc53c9	DBZ-278 Explicit history storage initialization	2018-02-13 16:48:42 +01:00
Jiri Pechanec	ad7b92310e	DBZ-278 Test for broker without auto-created topics	2018-02-13 16:48:42 +01:00
Jiri Pechanec	55e81c458b	DBZ-278 Create database history topic	2018-02-13 16:48:42 +01:00
Jiri Pechanec	5a0c7c07b0	DBZ-609 DBZ-571 Refactor Postgres type handling TypeRegistry introduced for Postgres connector JDBC column does not have a special componentType JDBC column provide a database specific type id OID is a primary type identifier to be used in Postgres connector code - dropping JDB/OID dichotomy	2018-02-13 14:54:42 +01:00
Jiri Pechanec	dbac1429b2	DBZ-581 Improve error handling for replicators	2018-02-13 14:40:54 +01:00
Jiri Pechanec	ecdce6529d	DBZ-220 Null Heartbeat object. Messages contains key	2018-02-13 12:26:48 +01:00
Jiri Pechanec	682cabb84a	DBZ-220 Generate heartbeat events in binlog thread	2018-02-13 12:24:51 +01:00
Gunnar Morling	96b4023839	DBZ-220 Misc. improvements; * interpreting heartbeat interval as MS * adding Configuration#getDuration()	2018-02-13 12:24:51 +01:00
Jiri Pechanec	3c1ea8206b	DBZ-220 Refactor heartbeat to a separate class	2018-02-13 12:24:51 +01:00
Gunnar Morling	2e0b9e5e33	DBZ-582 Moving TOMBSTONES_ON_DELETE to CommonsConnectorConfig for re-use	2018-02-13 10:19:18 +01:00
Jiri Pechanec	d0bd1560ff	DBZ-576 Make DDL filter internal parameter	2018-02-12 11:22:28 +01:00
Jiri Pechanec	3740aee4e8	DBZ-576 Internal parameters	2018-02-12 11:22:28 +01:00
Gunnar Morling	a77c8d733a	DBZ-580, DBZ-586 Making enqueue() and thus snapshots in Postgres interruptable; * ChangeEventQueue#enqueue() checks the interrupted state of the calling thread now, raising an InterruptedException in case the interrupted flag has been set (because the producer's thread executor has been stopped) * RecordSnapshotProducer has been adjusted to check for the interrupted regularly, aborting if it has been set	2018-02-11 10:05:06 +01:00
Gunnar Morling	45a9847d42	DBZ-580 Initial implementation of dedicated change message queue: * To be used across connectors, unifying the enqueuing/polling logic and handling of related config options * Using it for Postgres connector	2018-02-11 10:05:06 +01:00
Gunnar Morling	2a724d9611	DBZ-537 Misc. adjustments: * Renaming ConfigurationHelper to Instantiator * Doc improvements and typo fixes * Bringing getInstance() methods into consistent order * Raising exception instead of logging error if instantion fails	2018-02-09 15:45:57 +01:00
Jiri Pechanec	5be55ae8ff	DBZ-537 OffsetCommitPolicy supports Configuration constructor initialization	2018-02-09 15:19:48 +01:00
Gunnar Morling	7b6db00db7	DBZ-593 JavaDoc	2018-02-02 13:16:43 +01:00
Gunnar Morling	712824e43f	DBZ-593 Passing version explicitly from SourceInfos to base class instead of relying specific packaging structure	2018-02-02 13:16:43 +01:00
Jiri Pechanec	7ebee94169	DBZ-593 Add Debezium version to source in envelope	2018-02-02 13:16:43 +01:00
Gunnar Morling	3a281d1185	DBZ-595 Making sure resources are cleaned up when snapshotting fails; * shutting down the snapshotting thread and the DB history producer client if the connector is stopped while trying to write to the history topic * reducing the time that KafkaProducer#send() will block if Kafka isn't available; this will release the producer thread quicker in case the connector is stopped during snapshotting * not returning from finally block (!) in case the TX is rolled back; This prevented the failed state to be set by the outer catch clause in execute()	2018-02-02 09:06:32 +01:00
Jiri Pechanec	92740a3626	DBZ-587 Fix thread leak, thread names	2018-02-01 10:04:20 +01:00
Jiri Pechanec	9b592204ac	DBZ-587 Centralize and unify thread management	2018-02-01 10:04:20 +01:00
Jenkins user	04624341f5	[maven-release-plugin] prepare for next development iteration	2018-01-25 09:39:44 +00:00
Jenkins user	898f6884e1	[maven-release-plugin] prepare release v0.7.2	2018-01-25 09:39:44 +00:00
Gunnar Morling	8edcf9f3d8	DBZ-507 Making "wkb" optional in Point schema to keep compatability with previous version	2018-01-24 16:51:54 +01:00
Robert Coup	f9d90a482f	DBZ-507 Expanding support for geometry types; * Adding support for PostGIS geometry types * Adding support for GEOMETRY, POLYGON and more in MySQL * For all newly supported types, changes are represented using two new schema types Geometry and Geography, containing the WKB (binary geo data) ans srid (coord system identifier) * The existing Point type also contains the new (optional) srid field	2018-01-24 16:51:49 +01:00
Jiri Pechanec	01577b40c3	DBZ-541 Only whitelisted tables are recorded in history	2018-01-24 14:43:57 +01:00
Gunnar Morling	ea4b366b7f	DBZ-538 Capturing unparseable DDL for logging in one more place	2018-01-24 11:30:51 +01:00
Peter Goransson	7d78357087	DBZ-443 Adding db-history recovery mode: schema_only_recovery	2018-01-23 09:17:39 +01:00
Gunnar Morling	0c4190c493	DBZ-516 Using Duration instead of long in a few more places	2018-01-18 14:13:58 +01:00
Jiri Pechanec	24bdcaf059	DBZ-516 Return control to Connect periodically	2018-01-18 14:13:58 +01:00
Gunnar Morling	9f433b1e24	DBZ-544 Comment fix	2018-01-17 15:16:37 +01:00
Jiri Pechanec	cdf44faba4	DBZ-544 Log matching DDL filter	2018-01-17 15:16:37 +01:00
Gunnar Morling	4d1d016acb	DBZ-392 Making table map and set dedicated classes just with the needed API	2018-01-17 11:21:02 +01:00
Jiri Pechanec	9f0f0fe5d4	DBZ-392 Support for MySQL running on lowercase filesystems	2018-01-17 11:21:02 +01:00
Gunnar Morling	b99bdf7fdc	DBZ-543 Removing a few unused methods	2018-01-15 14:37:46 +01:00
Gunnar Morling	5c88431c07	DBZ-494 Making tests more lenient towards specific List implementations; also fixing a few typos.	2018-01-15 10:40:50 +01:00
Tom Bentley	5b839d3665	DBZ-540 Adding KafkaCluster#zkPort() If you start a cluster (e.g. in a test) without specifying a port you get a random port. Sometimes you might want to connect to the embedded zookeeper instance (for instance, to make an assertion about a znode). To do this you need to know the port number. So let's expose it.	2018-01-12 18:25:41 +01:00
Jiri Pechanec	0b269a6e41	DBZ-538 Improve invalid DDL statement reporting	2018-01-12 12:26:30 +01:00
Jiri Pechanec	2a7377a833	DBZ-455 Use valueOf instead of constructors	2018-01-05 02:32:37 +01:00
Jenkins user	6bb34b42f9	[maven-release-plugin] prepare for next development iteration	2017-12-20 07:15:12 +00:00
Jenkins user	16dcd4c980	[maven-release-plugin] prepare release v0.7.1	2017-12-20 07:15:12 +00:00
Jenkins user	5e09932cb9	[maven-release-plugin] prepare for next development iteration	2017-12-15 05:10:23 +00:00
Jenkins user	6c1d61e03b	[maven-release-plugin] prepare release v0.7.0	2017-12-15 05:10:23 +00:00
Gunnar Morling	4e8cedd094	DBZ-379 Postgres connector minimizes use of JDBC metadata	2017-12-13 12:20:37 +01:00
Jiri Pechanec	5ae236241b	DBZ-469 Fixed a backslash regex in testcase	2017-12-11 16:58:23 +01:00
Gunnar Morling	8e99dc3abd	DBZ-469 Adding one more test case	2017-12-11 16:58:23 +01:00
Gunnar Morling	1c55e41941	DBZ-469 Renaming listOfRegex() to setOfRegex(); * Simplifying test of that method * Adding test to ensure correctness of default DDL pattern	2017-12-11 16:58:23 +01:00
Jiri Pechanec	86d9e109fc	DBZ-469 Filter our RDS heartbeat INSERT statements	2017-12-11 16:58:23 +01:00
rkerner	c7ac481c43	[DBZ-342] fix broken MySQL data type "TIME" handling	2017-11-29 20:34:12 +01:00
Gunnar Morling	a55227aa83	DBZ-464 Don't stop after reaching max retry count, but raise an exception instead; Also increasing default value, as the connector can't start its work without fully recovering the history	2017-11-28 08:47:27 +01:00
Gunnar Morling	6537d904ce	DBZ-464 Reading until end offset of history topic	2017-11-28 08:47:27 +01:00
Gunnar Morling	bc2d0e5956	DBZ-464 Removing unused method parameters from AbstractDatabaseHistory#recoverRecords()	2017-11-28 08:47:27 +01:00
Jiri Pechanec	20a2cdfdea	DBZ-476 Doubled quotes are parsed as escaped	2017-11-23 14:51:51 +01:00
Jiri Pechanec	57e7f84163	DBZ-479 Forced fsync slows down tests	2017-11-22 14:34:59 +01:00
Gunnar Morling	2b3276be1d	DBZ-478 Correctly handling null value converters; * using simple for loop for the sake of easier debugging * log info about unsupported column type on WARN rather than TRACE level	2017-11-22 09:43:10 +01:00
David Szabo	1c07ff4775	DBZ-466 Remove hardcoding of schema version number, leaving it empty instead	2017-11-16 17:43:14 +01:00
Jiri Pechanec	4d253d2987	DBZ-437 String tokens and SQL words are handled separately in procedure parsing	2017-11-14 09:40:54 +01:00
Jiri Pechanec	5e7714eb00	DBZ-456 Reduce text execution time	2017-11-14 06:47:32 +01:00
Henryk Konsek	be40502cc7	DBZ-323 Cluster and server properties should be added into config, not set as default ones	2017-11-13 16:35:57 +01:00
Gunnar Morling	2744f54e9d	DBZ-438 Renaming BufferedBlockingConsumer#flush() to close() to make clear it's onyl meant to be called once at the end	2017-11-13 14:19:39 +01:00
Gunnar Morling	2b62943e4f	DBZ-438 Avoiding issues due to concurrent usage of BufferedBlockingConsumer	2017-11-13 14:19:39 +01:00
Gunnar Morling	5fbe742be8	DBZ-285 Specifying scope of dependencies in the individual POMs for the sake of comprehensibility	2017-11-10 16:48:32 +01:00
Gunnar Morling	580647b226	DBZ-285 Making more dependencies "provided"	2017-11-10 16:33:02 +01:00
Ewen Cheslack-Postava	8826669b43	DBZ-285: Use provided or test dependencies for Connect and Kafka dependencies	2017-11-04 12:01:24 -07:00
Jiri Pechanec	a6bd883857	DBZ-432 Rebased to Kafka 1.0.0	2017-11-03 11:06:18 +01:00
Jiri Pechanec	e841cbd609	DBZ-429 Fixed signed default value handling	2017-11-03 06:13:05 +01:00
Gunnar Morling	3f410c20e5	DBZ-422 Using existing constants BigDecimal.ONE and ZERO; Also using valueOf() methods to benefit from internal caching for most numeric types	2017-11-02 12:23:06 +01:00
pmx	3c91ad47ec	DBZ-422 Extracting numeric constants and boolean conversions into NumberConversions class	2017-11-02 12:12:54 +01:00
pmx	c3ce7c571a	DBZ-422 Making sure convertSmallInt() always returns a short	2017-11-02 12:12:54 +01:00
Jiri Pechanec	f93e1e9bcd	DBZ-395 Make temp table regex stricter	2017-10-26 11:11:22 +02:00
Jiri Pechanec	1c6e652c71	DBZ-395 Filter out DROP TEMPORARY TABLE statements	2017-10-26 11:11:22 +02:00
Ben Williams	a3b4fedd5f	DBZ-363 Add support for BIGINT UNSIGNED handling for MySQL	2017-10-18 10:20:03 +02:00
Gunnar Morling	73189892b3	DBZ-258 Misc. improvements while reviewing the change; * Removing superfluous config option * Making loggers static * JavaDoc fixes * Extracting hexStringToByteArray() to helper and adding test * Removing superfluous super() invocations	2017-10-18 09:21:22 +02:00
Jiri Pechanec	e47b4cb81c	DBZ-258 Changes after first review	2017-10-18 09:21:22 +02:00
Jiri Pechanec	0bc8129961	DBZ-258 Support for wal2json plugin	2017-10-18 09:21:22 +02:00
Jenkins user	75937711fa	[maven-release-plugin] prepare for next development iteration	2017-09-21 04:42:02 +00:00
Jenkins user	a89b9332e4	[maven-release-plugin] prepare release v0.6.0	2017-09-21 04:42:02 +00:00
Gunnar Morling	7f20aca03d	DBZ-226 Minor wording and formatting improvements	2017-09-20 15:11:13 +02:00
Jiri Pechanec	0afb687303	DBZ-226 SMT to change CDC to simple record	2017-09-20 12:05:12 +02:00
Jiri Pechanec	ba3d7d762b	DBZ-318 Support for a Decimale with variable scale schema	2017-09-19 08:39:07 +02:00
Gunnar Morling	99f39038bb	DBZ-347 Formatting	2017-09-18 17:04:14 +02:00
Jiri Pechanec	61ccefcc5c	DBZ-347 Configurable behaviour for unknown DDL, failing by default	2017-09-18 17:02:21 +02:00
Jiri Pechanec	f7736cd0ab	DBZ-341 Ignore malformed messages coming from database history	2017-09-18 17:02:21 +02:00
Jiri Pechanec	01980cf29b	DBZ-341 Ignore malformed messages coming from database history	2017-09-15 11:52:18 +02:00
Jiri Pechanec	b53b4f67d2	DBZ-319 Views are now ignored in snapshots	2017-09-15 11:37:38 +02:00
Jiri Pechanec	e4bc6670c8	DBZ-305 Rebase build process against Kafka 0.11	2017-08-17 18:51:01 +02:00
Jenkins user	214696ef0c	[maven-release-plugin] prepare for next development iteration	2017-08-17 11:51:05 +00:00
Jenkins user	c867e6fea6	[maven-release-plugin] prepare release v0.5.2	2017-08-17 11:51:05 +00:00
Gunnar Morling	1665026c8a	DBZ-298 Adding support for quoted identifiers for schemas, tables and columns; expanding test	2017-08-17 09:08:17 +02:00
Gunnar Morling	564d90dc09	DBZ-298 Formatting	2017-08-17 09:03:16 +02:00
Gunnar Morling	a4dc7620fe	DBZ-327 Simplifying validation implementation a bit via Strings#isNullOrEmpty()	2017-08-16 14:37:01 +02:00
Mario Mueller	2a2f911e74	DBZ-327 Fix broken SMT validation and add some tests for the configuration combinations	2017-08-16 11:57:48 +02:00
Emrul Islam	bb79b2f60f	DBZ-297 Initial support for some Postgres array types (one-dimensional only)	2017-07-27 10:04:13 +02:00
Gunnar Morling	79fbc028a8	DBZ-311 Precompiling and simplifying some regular expressions	2017-07-25 21:43:26 +02:00
Gunnar Morling	825530256e	DBZ-311 Removing trailing whitespace	2017-07-25 21:43:26 +02:00
Gunnar Morling	a8d1817c22	[maven-release-plugin] prepare for next development iteration	2017-06-09 16:14:31 +00:00
Gunnar Morling	3f512aace7	[maven-release-plugin] prepare release v0.5.1	2017-06-09 16:14:31 +00:00
Gunnar Morling	2a22f3d4b0	DBZ-254 Right-padding values for fixed-length BINARY columns with 0x00 (zero byte) characters for MySQL; Also fixing JDBC types for binary data types for MySQL.	2017-05-30 15:04:38 +02:00
Tom Bentley	96eda35da5	Respect any existing auto.create.topics.enable when creating a server	2017-05-23 16:58:12 +01:00
dleibovic	a683629519	Change ByLogicalTableRouter logging from info to debug This can be a bit spammy in the info level	2017-05-19 12:09:40 -04:00
Randall Hauch	02e655fa53	DBZ-198 Add another DDL parser test case Added an additional test that is unable to reproduce the problem reported on April 4.	2017-05-19 15:50:03 +02:00
Jiri Pechanec	ec915c559c	[DBZ-246] Correctly handle enumeratedvale type	2017-05-09 11:24:28 +02:00
Gunnar Morling	1158833524	DBZ-238 Removing some trailing whitespace	2017-05-05 21:59:06 +02:00
Gunnar Morling	64958d11db	DBZ-238 Moving EnumeratedValue to its own file	2017-05-05 21:55:46 +02:00
Brendan Maguire	928c9cb5f0	DBZ-238: Fix Postgres database.sslmode = "require" bug * Also added EnumeratedValue interface for Configuration so getValue is used instead of toString when getting the config value	2017-05-05 21:55:45 +02:00
Gunnar Morling	1074de4efa	DBZ-222 Some more clean-up	2017-05-04 09:25:25 +02:00
Omar Al-Safi	ea092f1baf	DBZ-222 Add null to Point comment	2017-05-04 08:53:46 +02:00
Gunnar Morling	5630b61be6	DBZ-222 dependency clean-up	2017-05-04 08:53:05 +02:00
Omar Al-Safi	791545c5f4	DBZ-222 Added support for MySQL POINT type	2017-05-04 08:53:05 +02:00
Randall Hauch	d42736e065	DBZ-121 Corrected JavaDoc and reorganized imports	2017-04-04 16:53:35 -05:00
dleibovic	8e304c4cba	rename fields and simplify descriptions	2017-04-03 11:35:33 -04:00
dleibovic	fa157b0159	make the name of the physical table identifier field added to the key schema configurable	2017-03-31 13:09:32 -04:00
dleibovic	d7acda04c4	DBZ-121: routing to custom topic names via Kafka Connect Transformations.	2017-03-30 15:02:59 -04:00
Randall Hauch	709cd8f3fe	[maven-release-plugin] prepare for next development iteration	2017-03-27 11:28:12 -05:00
Randall Hauch	2bc3d45954	[maven-release-plugin] prepare release v0.5.0	2017-03-27 11:28:11 -05:00
Randall Hauch	81f62b6961	DBZ-205 Corrected MySQL connector to handle 2-digit years MySQL has special handling of 2-digit years that it deems are ambiguous, such as the year value `17` that is actually treated as `2017`. Apparently the 2-digit values are stored in MySQL and the interpretation is performed when the data is extracted, so therefore the connector needs to also perform this adjustment of the year values. This commit uses the JDK’s `TemporalAdjuster` interface and passes this down to the requisite temporal-related datatype handling code. The MySQL connector then provides its own `TemporalAdjuster` implementation that adjusts the year values via the excellend JDK `Temporal` methods. A row in one of the MySQL test databases was changed to use a 2-digit year of `16` while the test method still checks that the year is still 2016`, verifying that the year value is properly adjusted.	2017-03-27 10:58:21 -05:00
Randall Hauch	613e6b6340	Fixed compilation warnings	2017-03-27 10:37:15 -05:00
Randall Hauch	7a72ed6ae6	Merge pull request #202 from don41382/upgrade-kafka-version-to-0.10.2.0 DBZ-203 Upgrade kafka version from 0.10.1.1 to 0.10.2.0	2017-03-17 17:21:48 -05:00
Randall Hauch	430d756062	[maven-release-plugin] prepare for next development iteration	2017-03-17 15:41:58 -05:00
Randall Hauch	536cbf6300	[maven-release-plugin] prepare release v0.4.1	2017-03-17 15:41:57 -05:00
Randall Hauch	e5ee3847dd	DBZ-195 Added tests to try to replicate a reported issue Added a table and inserted rows that tries to replicate the problem reported in DBZ-195, but the test was unable to replicate the problem. In fact, this really is no different than existing tests. Changed the log messages so that if/when this happens again it will be possible to know which row was problematic.	2017-03-17 10:47:32 -05:00
Felix Eckhardt	5d414c521a	upgraded kafka version from 0.10.1.1 to 0.10.2.0	2017-03-17 11:36:15 +11:00
Randall Hauch	cf5391482a	DBZ-198 Improved MySQL DDL parser to better handle blocks The MySQL parser now properly handles control blocks such as `BEGIN…END`, `IF…END IF`, `REPEAT…END REPEAT`, and `LOOP…END LOOP`, even in cases where the block is preceded by and terminated by a label.	2017-03-16 13:32:21 -05:00
Randall Hauch	b48ccce4b5	DBZ-200 Corrected MySQL DDL parser to better handle column definitions Apparently not all reserved words must be quoted when using them as colum names, so refactored MySQL’s DDL parser to better handle a variety of unquoted colum names that are reserved words.	2017-03-08 12:12:27 -06:00
Randall Hauch	d2986710a5	DBZ-188 More efficient GTID source filters for MySQL Connector Changed the GTID source filters in the MySQL connector to be far more efficient when the filters specify literal UUIDs rather than regex patterns. In these cases, the predicate just checks whether a supplied value is in a hash set, and no regular expression patterns are used. The GTID source filters can still be a combination of UUID literals and regular expressions, and the predicate will use the best implementation for each. For example, if the filters include all UUID literals, then regular expressions will never be used.	2017-02-10 11:34:24 -06:00
Randall Hauch	8c60c29883	[maven-release-plugin] prepare for next development iteration	2017-02-07 14:22:12 -06:00
Randall Hauch	20134286e9	[maven-release-plugin] prepare release v0.4.0	2017-02-07 14:22:11 -06:00
Randall Hauch	74e5ba6448	DBZ-176 Corrected MySQL DDL parser to support creating triggers with definers The MySQL DDL parser was not correclty handling `DEFINER` clauses within `CREATE TRIGGER` or `CREATE EVENT` statements. Support for `DEFINER` clauses was recently added for the various forms of `CREATE PROCEDURE`, `CREATE FUNCTION` and `CREATE VIEW` statements. These are the only kinds of statements that have the definer attribute, per the [MySQL documentation](https://dev.mysql.com/doc/refman/5.7/en/stored-programs-security.html).	2017-02-02 12:44:28 -06:00
Randall Hauch	972cfbe2c4	DBZ-173 Additional fixes to KafkaDatabaseHistory class for Kafka 0.10.1.0 The KafkaDatabaseHistory class was not behaving well in tests using my local development environment. When restoring from the persisted Kafka topic, the class would set up a Kafka consumer and see repeated messages. It is unclear whether the repeats were due to our test environment and very short poll timeouts. Regardless, the restore logic was refactored to track offsets so as to only process messages once.	2017-02-01 14:47:41 -06:00
Horia Chiorean	d035c4bc8d	DBZ-173 Changes the MySQL ITs to not use TZ information for expected dates and fixes the character set for parsing test files	2017-01-27 14:53:10 +02:00
Horia Chiorean	a2154d3d32	DBZ-173 Changes the MySQL ITs to use the database.hostname system property instead of always hardcoding 'localhost'	2017-01-27 09:19:57 +02:00
Horia Chiorean	7dfdef3558	DBZ-173 Upgrades the Kafka artifact versions to 0.10.1.1	2017-01-27 09:19:57 +02:00
Horia Chiorean	031c4a1552	DBZ-183 Fixes the BinlogReader's handling of TIMESTAMP columns to correctly account for timezones	2017-01-25 16:39:36 +02:00
Randall Hauch	f0db8d1b1f	DBZ-179 Corrected JavaDoc in PostgreSQL connector Corrected the JavaDoc and removed trailing spaces in the PostgreSQL connector code.	2017-01-20 11:51:17 -06:00
Randall Hauch	a73f85a80f	Merge pull request #162 from rareddy/DBZ-177 DBZ-177: Providing an alternative way to create JDBC connection based …	2017-01-13 13:37:38 -06:00
Ramesh Reddy	a9aace3480	DBZ-177: Providing an alternative way to create JDBC connection based on the configured JDBC driver class name and supplied classloader. The loading/creating the JDBC connections is not reliable when driver libraries in a different classloader than the DriverManager.	2017-01-13 12:58:14 -06:00
Horia Chiorean	ae85656851	DBZ-3 Fixes topic naming to include the name of the server	2016-12-30 09:42:48 +02:00
Horia Chiorean	737614a555	DBZ-3 Implements a connector for streaming changes from a Postgres database The version of the DB server required for this to work is at least 9.4. To be able to stream logical changes, the code relies on enhancements to the JDBC driver which are not yet public. Therefore, the current codebase includes the sources for the JDBC driver. The commit also updates the general DBZ build system for: * custom checkstyle package exclusions - required by the Postgres driver the protobuf code for now * adds support for debugging Surefire and Failsafe	2016-12-27 14:44:32 +02:00
Horia Chiorean	23e3f59fa1	DBZ-3 Implements a connector for streaming changes from a Postgres database The version of the DB server required for this to work is at least 9.4 The commit also updates the general DBZ build system for: * custom checkstyle package exclusions - required by the Postgres driver the protobuf code for now * adds support for debugging Surefire and Failsafe	2016-12-27 14:44:32 +02:00
Horia Chiorean	8e14f150db	DBZ-3 Adds the structure for a Postgres connector which uses a Debezium Postgres docker image that has the decoderbufs plugin enabled to read WAL changes	2016-12-27 14:44:29 +02:00
Randall Hauch	5dceb05f69	DBZ-151 Additional changes to improve test framework and MySQL integration tests	2016-12-20 10:58:56 -06:00
Randall Hauch	a3bece4472	DBZ-151 Added new integration test framework for easily comparing output of connectors to expected results.	2016-12-20 09:18:09 -06:00
Randall Hauch	0851d8280c	DBZ-166 Corrected shutdown logic of MySQL connector The MySQL connector uses several threads, so previously upon connector shutdown these threads were simply cancelled. This is fine for the binlog reader (which can stop at any moment), but is a poor approach for the snapshot as we didn’t always properly release the database resources and also didn’t complete the writing of the DDL history. With this change, the snapshot reader stops in a very controlled manner, basically by having the 10-step snapshot procedure frequently check whether the reader is to continue working, and to completely avoid thread interruption altogether. And, the snapshot procedure will always clean up its database resources (locks, transactions, etc.), even if the procedure is stopped before completion. This change also refactors how the snapshot and binlog reader are managed. This is no longer done in the MySqlConnectorTask class (which is busy enough), but rather the logic has been encapsulated in a new `ChainedReader` that makes use of a new `Reader` interface. This makes testing of `ChainedReader` easier, and ensure that `ChainedReader` relies only upon the primary methods of `Reader` rather than upon `AbstractReader`. `ChainedReader` handles multiple readers generically, and ensures that when stopped the readers are all handled correctly and completely process all records, yet avoid accidentally starting a subsequent reader(s) when stopping the previous reader.	2016-12-15 10:55:18 -06:00
Randall Hauch	c762a221b7	DBZ-162 Corrected DDL parsing of MySQL functions The MySQL DDL parser was not properly consuming function declarations. For functions, the parser consumes the entire statement without handline the various expressions within the function declaration, but the parser was not properly finding the end of the statement and instead was continuing to try to consume values beyond the end of the statement. Specifically, when the parser consumes a `BEGIN`, it looks for a corresponding `END`. However, if it encountered an `END IF`, the `IF` plus any remaining tokens were left on the token stream and unprocessed. This confused the parser, which keep looking for statements and ultimately ended with a `No more content` error. This case was replicated in integration tests, and the code fixed to properly find the end of the statements.	2016-12-06 17:34:52 -06:00
Sherafudheen PM	ee52219736	DBZ-160 - Issue while parsing create table script with ENUM type and default value 'b'	2016-12-02 17:42:44 +05:30
Randall Hauch	d80bc1bfd7	DBZ-153 MySQL connector supports enum and set values with parentheses Changed the MySQL connector to support ENUM and SET literals with parentheses.	2016-11-14 12:22:08 -06:00
Randall Hauch	b0ded5f383	DBZ-147 Added ability to treat MySQL DECIMAL as double By default the MySQL connector handles `DECIMAL` and `NUMERIC` columns using `java.math.BigDecimal` values and describing them using the `org.apache.kafka.connect.data.Decimal` schema type, which serializes the values to a binary form. This change adds a configuration option that will keep the default behavior, but will instead allow handling `DECIMAL` adn `NUMERIC` values as Java `double` and a schema type of `FLOAT64`.	2016-11-09 11:27:09 -06:00
Randall Hauch	ea5f7983c7	DBZ-144 Corrected MySQL connector restart Added tests to verify whether the connector is properly restarting in the binlog when previously the connector failed or stopped in the middle of a transaction. The tests showed that the connector is not able to properly start when using or not using GTIDs, since restarting from an arbitrary binlog event causes problems since the TABLE_MAP events for the affected tables are skipped. The logic was changed significantly to record in the offsets the binlog coordinates at the start of the transaction, which should work whether or not GTIDs are used. Upon restart, the connector may have to re-read the events that were previously processed, but now the offset also includes the number of events that were previously processed so that these can be skipped upon restart. This has an unforunate side effect since the offsets capture a transaction was completed only when it generates a source record for the subsequent transaction. This is because the connector generates source records (with their offsets) for the binlog events in the transaction before the transaction's commit is seen. And, since no additional source records are produced for the transaction commit, the recorded offsets will show that the prior transaction is complete and that all of the events in the subsequent transaction are to be skipped. Thus, upon restart the connector has to re-read (but ignore) all of the binlog events associated with the completed transaction. This shouldn’t be a problem, and will only slow restarts for very large transactions.	2016-11-09 08:11:41 -06:00
Randall Hauch	207315e5df	DBZ-146 Improved error handling of MySQL Connector Improved the error handling of the MySQL connector to ensure that we’re always stopping the connector when we have a problem handling a binlog event or if we have problems starting.	2016-11-03 16:55:59 -05:00
Randall Hauch	99a86ad289	Merge pull request #112 from rhauch/dbz-123 DBZ-123 Corrected the MySQL DDL parser to properly handle bit-set literals	2016-10-07 17:16:37 -05:00
Randall Hauch	332de18384	Corrected headers	2016-10-07 17:16:27 -05:00
Randall Hauch	beb47dd2de	DBZ-131 Improved logging while reading binlog When the MySQL connector is reading the binlog, it outputs INFO log messages reporting status at an exponentially-increasing rate, starting at every 5 seconds and doubling until a max period of 1 hour. This output is useful when the connector starts to know that it is working, but thereafter the usefulness decreases. Once an hour is probably acceptable output. This is not intended to replace the capturing of metrics, but is merely an aid to easily tell via the logs whether the connector continues to work. Also improved the log message when the binlog reader stops to capture the total number of events recorded by Kafka Connect and the last recorded offset.	2016-10-07 17:10:01 -05:00
Randall Hauch	50eb4094ac	DBZ-123 Corrected the MySQL DDL parser to properly handle bit-set literals The DDL parser now properly handles bit-set literals, and several minor case-sensitivity bugs dealing with other escaped literals.	2016-10-06 13:25:38 -05:00
Randall Hauch	730603976d	Merge pull request #107 from rhauch/dbz-123 DBZ-123 Corrected MySQL Connector's support for BIT(n) columns	2016-09-21 15:22:00 -05:00
Randall Hauch	3e2d953b1a	Merge pull request #103 from rhauch/dbz-122 DBZ-122 Prevent logging of password configuration property values	2016-09-21 15:15:02 -05:00
Randall Hauch	bcf60940db	DBZ-123 Corrected MySQL Connector's support for BIT(n) columns Corrected how the MySQL connector is treating columns of type `BIT(n)`, where _n_ is the number of bits in the value. When `n=1`, the resulting values are booleans; when `n>1`, the resulting values are little endian `byte[]` that have the minimum number of bytes to hold the `n` bits.	2016-09-21 15:04:20 -05:00
Randall Hauch	9aae6c62d9	DBZ-124 Eliminated the JMX "already registered" warning in the MySQL connector The `KafkaDatabaseHistory` was always creating a new producer whenever its `start()` method was called, even if it were called more than once. And, the `MySqlSchema` was calling `start()` twice, resulting in multiple producers being created and registered with JMX. Both issues were fixed. Also, UUIDs were being used as the name of the JMX MBean for the producer, unless the `database.history.consumer.client.id` and `database.history.producer.client.id` properties were being explicitly set. Now, the MySQL connector will by default set the `client.id` property on both the database history's Kafka consumer and producer to `{connectorName}-dbhistory`. Of course, the `database.history.consumer.client.id` and `database.history.producer.client.id` properties can still be set to define the name of the producer and consumer.	2016-09-21 10:05:15 -05:00
Randall Hauch	54b737edc1	DBZ-114 MySQL connector now handles "zero-value" dates and timestamps MySQL supports "zero-value" dates and timestamps, but these cannot be represented as valid dates or timestamps using the Java types. For example, the zero-value `0000-00-00` for a date has what Java considers to be an invalid month and day-of-the-month. This commit changes how the MySQL connector handles these values to not throw exceptions. When columns allow nulls, such values will be treated as nulls; when columns do not allow null values, these values will be converted to a "zero-value" for the corresponding Java representation (e.g., the epoch day or timestamp). A new test case verifies the behaviors.	2016-09-21 09:23:12 -05:00
Randall Hauch	40c1398a95	DBZ-122 Prevent logging of password configuration property values Anytime we `toString()` a `Configuration`, any values for password properties should be masked. A password property is defined to be a property whose key ends in "password" in a case-insensitive manner.	2016-09-15 15:20:55 -05:00
Randall Hauch	330a27ce52	Merge pull request #97 from rhauch/dbz-102 DBZ-102 MySQL connector support for column charsets	2016-08-29 15:12:24 -05:00
Randall Hauch	cc94bbc697	DBZ-102 MySQL connector now processes character sets The MySQL binlog events contain the binary representation of string-like values as encoded per the column's character set. Properly decoding these into Java strings requires capturing the column, table, and database character set when parsing the DDL statements. Unfortunately, MySQL DDL allows columns (at the time the columns are created or modified) to inherit the default character set for the table, or if that is not defined the default character set for the database, or if that is not defined the character set for the server. So, in addition to modifying the MySQL DDL parser to support capturing the character set name for each column, it also had to be changed to know what these default character set names are. The default character sets are all available via MySQL server/session/local variables. Although strictly speaking the character set variables cannot be set globally, MySQL DDL does allow session and local variables to be set with `SET` statements. Therefore, this commit enhances the MySQL DDL parser to parse `SET` statements and to track the various global, session, and local variables as seen by the DDL parser. Upon connector startup, a subset of server variables (related to character sets and collations) are read from the database via JDBC and used to initialize the DDL parser via `SET` methods. In addition to initializing the DDL parser with the system variables related to character sets and collation, it is important to also capture the server and database default character sets in the database history so that the correct character sets are used for columns even when the default character sets have changed on the database and/or the server. Therefore, upon startup or snapshot the MySQL connector records in the database history a `SET` statement for the `character_set_server` and `collation_server` system variables so that, upon a later restart, the history's DDL statements can be re-parsed with the correct default server and database character sets. Also, when the MySQL connector reloads the database history (upon startup), the recorded default server character set is compared with the MySQL instance's current server character set, and if they are different the current character set is recorded with a new `SET` statement. These extra steps ensure that the connector use the correct character set for each column, even when the connector restarts and reloads the database history captured by a previous version of the connector. IOW, the MySQL connector can be safely upgraded, and the new version will correctly start using the columns' character sets to decode the string-like values.	2016-08-29 12:19:24 -05:00
Randall Hauch	257e81c540	DBZ-102 MySQL in-memory models of tables capture column character sets The DDL parser and in-memory models of the relational schemas were changed to capture the character set for each column whose type is a string (e.g., `CHAR`, `VARCHAR`, etc.). This required handling `SET` statements used to change the system variables that hold the names of the default character set for the server and for each database. So, even if a column does not explicitly define the character set, the column's actual character set is identified from the table's character set, which might default to the current database's character set, which if not set defaults to the system character set. These changes merely affect how MySQL DDL is parsed and the in-memory relational schema representation to accommodate the character set at various levels. It does not change the behavior of the MySQL connector; that will be done in a subsequent commit. All tests pass with these changes, including quite a few additional tests for the new functionality.	2016-08-29 11:50:51 -05:00
Randall Hauch	638b459484	DBZ-108 Removed the TimeZoneAdapter and test, which is no longer used	2016-08-24 16:31:35 -05:00
Randall Hauch	4de56fd657	Merge pull request #94 from hchiorean/DZB-header-fix Fixes the DBZ header required by checkstyle	2016-08-24 14:28:43 -05:00
Randall Hauch	ce2b2db80c	DBZ-99 Added support for MySQL connector to connect securely to MySQL Changed the MySQL connector to have several new configuration properties for setting up the SSL key store and trust store (which can be used in place of System or JDK properties) used for MySQL secure connections, and another property to specify what kind of SSL connection be used. Modified several integration tests to ensure all MySQL connections are made with `useSSL=false`.	2016-08-24 13:27:35 -05:00
Horia Chiorean	2732d26ff0	Fixes the DBZ header required by checkstyle This commit removes an extra space character from the first blank line of the header	2016-08-24 13:41:15 +03:00
Randall Hauch	448d514c81	DBZ-106 Corrected the MySQL DDL parser to properly handled quoted keywords as column names.	2016-08-23 17:03:53 -05:00
Randall Hauch	e86fb83459	[maven-release-plugin] prepare for next development iteration	2016-08-16 09:56:47 -05:00
Randall Hauch	ccdb0a1a63	[maven-release-plugin] prepare release v0.3.0	2016-08-16 09:56:47 -05:00
Randall Hauch	918a523f12	DBZ-100 Changed the MongoDB connector to use a new JSON semantic type Added a semantic type for JSON strings, and used it in the MongoDB connector.	2016-08-15 12:11:35 -05:00
Randall Hauch	db49f0b17b	DBZ-100 Removed unused IsoTimestamp and IsoTime semantic types	2016-08-15 12:11:35 -05:00
Randall Hauch	d8a5d2b50f	DBZ-100 Corrected MySQL connector's use of ENUM and SET values The ENUM and SET values read from the binlog contain the indexes of the options that are included in the value, but this doesn't compared with the string values returned by MySQL and JDBC that contain the comma-separated options. With this change, the values read from the binlog will also be comma-separated strings.	2016-08-15 12:11:35 -05:00
Randall Hauch	6b591fc9b0	DBZ-91 Added a unit test for temporal conversions Also removed a non-unit-test test.	2016-08-15 10:29:16 -05:00
Randall Hauch	ba553c91e8	DBZ-91 Changed MicroTime to use INT64 There are more microseconds per day than can be represented with INT32, so this was changed to INT64.	2016-08-11 12:09:24 -05:00
Randall Hauch	19fc95fe08	DBZ-91 Simplified the temporal conversion functions to use primitives.	2016-08-11 10:48:38 -05:00
Randall Hauch	629542458e	DBZ-91 Added option to force use Kafka Connect temporal types.	2016-08-11 10:48:07 -05:00
Randall Hauch	31641fb43e	DBZ-91 Changed how temporal values are treated in MySQL connector Rewrote how the MySQL connector converts temporal values to use schemas with names that identify the semantic type of temporal value, and customized how the MySQL binlog client library creates Java object values from the raw binlog events. Several new "semantic" schema types were defined: * `io.debezium.time.Year` represents a year number as an INT32 value (e.g., 2016, -345, etc.). * `io.debezium.time.Date` represents a date by storing the epoch seconds (that is, the number of seconds past the epoch) as an INT64 value. * `io.debezium.time.Time` represents a time by storing the milliseconds past midnight as an INT32 value. * `io.debezium.time.MicroTime` represents a time by storing the microsconds past midnight as an INT32 value. * `io.debezium.time.NanoTime` represents a time by storing the nanoseconds past midnight as an INT32 value. * `io.debezium.time.Timestamp` represents a date and time (without timezone information) by storing the milliseconds past epoch as an INT64 value. * `io.debezium.time.MicroTimestamp` represents a date and time (without timezone information) by storing the microseconds past epoch as an INT64 value. * `io.debezium.time.NanoTimestamp` represents a date and time (without timezone information) by storing the nanoseconds past epoch as an INT64 value. * `io.debezium.time.ZonedTime` represents a time with timezone and optional fractions of a second (but no date) by storing the ISO8601 form as a STRING value (e.g., `10:15:30+01:00`) * `io.debezium.time.ZonedTimestamp` represents a date and time with timezone and optional fractions of a second by storing the ISO8601 form as a STRING value (e.g., `2011-12-03T10:15:30.030431+01:00`) This range of semantic types allows for a far more accurate representation in the events of the temporal values stored within the database. The MySQL connector chooses the semantic type based upon the precision of the MySQL type (e.g., `TIMESTAMP(6)` will be represented with `io.debezium.time.MicroTimestamp`, whereas `TIMESTAMP(3)` will be represented with `io.debezium.time.Timestamp`). This ensures that the events do not lose precision and that the semantics of the database column values are retained in the events even though the values are represented with primitive values. Obviously these Kafka Connect schema representations are different and more precise than the built-in `org.apache.kafka.connect.data.Date`, `org.apache.kafka.connect.data.Time`, and `org.apache.kafka.connect.data.Timestamp` logical types provided by Kafka Connect and used by the MySQL connector in all 0.2.x and 0.1.x versions. Migration to the new MySQL connector should be possible, although consumers may still need to know about these types to properly handle temporal values and the correct precision (i.e., consumers can just assume all date INT64 values represent milliseconds). The MySQL binlog client library converted the raw binary event information to JDBC types using a local Calendar instance, which obviously incorporates the local timezone and cannot retain more than millisecond precision. This change extends the library's deserializers to instead use the Java 8 `javax.time` classes and to retain the exact semantics of the database values and to not lose any precisions (since the `javax.time` classes have nanosecond precision). The same logic is also used to convert the JDBC values obtained during a snapshot from the MySQL Connect/J JDBC driver. The latter has a few quirks, such as not returning any fractional seconds for `TIME` columns, even though `java.sql.Time` can store up to milliseconds. Most of the logic of the conversions of values and mapping to Kafka Connect schemas is handled in the new `JdbcValueConverters`, which was extracted from the existing `TableSchemaBuilder`. The MySQL connector reuses and actually extends the `JdbcValueConverters` class with its own `MySqlValueConverters` class that also adds support for MySQL-specific types such as `YEAR`. Other connectors whose values are based on JDBC types should be able to reuse and/or extend the `JdbcValueConverters` class. Integration tests that deal with temporal types were modified to use proper expected values and comparisons.	2016-08-10 15:51:07 -05:00
Horia Chiorean	ab24f013d1	DBZ-96 Removes some asserts on tables created by another test case	2016-08-08 14:25:38 +03:00
Randall Hauch	2ae26819af	DBZ-94 Added support for copying very large tables during snapshot By default the MySQL JDBC driver will put the entire result set into memory, which obviously doesn't work for tables of even moderate sizes. This change adds support for streaming rows in result sets when the tables have more than a configurable number of rows (defaults to 1,000). This posed a problem for how we were previously finding the last row in the last table; the MySQL driver does not support `ResultSet.isLast()` on result sets that are streamed. Instead, this commit wraps the consumer to which the snapshot reader writes all source records, with a consumer that buffers the last record. When the snapshot completes, the offset is updated (denoting the end of the snapshot) and set on the last buffered record before that record is flushed to the normal consumer. This should add minimal overhead while simplifying the logic to ensure the last source record has the updated offset. This also improves the log output of the snapshot process.	2016-08-04 16:06:50 -05:00
Horia Chiorean	bb1b7d5734	DBZ-92 Adds more logging information during MySQL snapshot recreation	2016-08-03 16:54:17 +03:00
Randall Hauch	8cb39eacf0	Reverted back to 0.3.0-SNAPSHOT, since the 0.3 candidate release was not acceptable.	2016-08-01 12:25:58 -05:00
Horia Chiorean	a6dddaed92	Fixes a couple of test related issues for debezium-core * fixes a java.sql.Date conversion test to take into account zone offsets * makes sure the ZK DB is closed during testing, otherwise file handles may leak and cause test failures	2016-07-26 14:17:31 +03:00
Randall Hauch	517272278d	[maven-release-plugin] prepare for next development iteration	2016-07-25 17:50:31 -05:00
Randall Hauch	b89296e646	[maven-release-plugin] prepare release v0.3.0	2016-07-25 17:50:31 -05:00
Randall Hauch	a8fa33e44b	DBZ-85 Corrected log statements to be debug	2016-07-25 16:59:46 -05:00
Randall Hauch	447acb797d	DBZ-62 Upgraded to Kafka and Kafka Connect 0.10.0.0 Upgraded from Kafka 0.9.0.1 to Kafka 0.10.0. The only required change was to override the `Connector.config()` method, which returns `null` or a `ConfigDef` instance that contains detailed metadata for each of the configuration fields, including supporting recommended values and marking fields as not visible (e.g., if they don't make sense given other configuration field values). This can be used by user interfaces to data-drive the configuration of a connector. Also, the default validation logic of the Connector implementations uses a `Validator` that is pretty restrictive in its functionality. Debezium already had a fairly decent and simple `Configuration` framework. After several attempts to try and merge these concepts, reconciling the two validation mechanisms was very complicated and involved a lot of changes. It was easier to simply continue Debezium-specific validation and to override the `Connector.validate(...)` method to use Debezium's `Configuration`-based validation. Connector-based validation logic includes determining recommended values, so Debezium's `Field` class (used to define each configuration property) was enhanced with a new `Recommender` class that is similar to Kafka's. Additional integration tests were added to verify that the `ConfigDef` result is acceptable and that the new connector validation logic works as expected, including getting recommended values for some fields (e.g., database names, table/collection names) from MySQL and MongoDB by connecting and dynamically reading the values. This was done in a way that remains backward compatible with the regular expression formats of these fields, but in a user interface that uses the `ConfigDef` mechanism the user can simply select the databases and table/collection identifiers.	2016-07-25 14:21:31 -05:00
Randall Hauch	30777e3345	DBZ-85 Added test case and made correction to temporal values Added an integration test case to diagnose the loss of the fractional seconds from MySQL temporal values. The problem appears to be a bug in the MySQL Binary Log Connector library that we used, and this bug was reported as https://github.com/shyiko/mysql-binlog-connector-java/issues/103. That was fixed in version 0.3.2 of the library, which Stanley was kind enough to release for us. During testing, though, several issues were discovered in how temporal values are handled and converted from the MySQL events, through the MySQL Binary Log client library, and through the Debezium MySQL connector to conform with Kafka Connect's various temporal logical schema types. Most of the issues involved converting most of the temporal values from local time zone (which is how they are created by the MySQL Binary Log client) into UTC (which is how Kafka Connect expects them). Really, java.util.Date doesn't have time zone information and instead tracks the number of milliseconds past epoch, but the conversion of normal timestamp information to the milliseconds past epoch in UTC depends on the time zone in which that conversion happens.	2016-07-20 17:07:56 -05:00
Randall Hauch	a5f4d0bf31	DBZ-87 Changed mapping of MySQL TINYINT and SMALLINT columns from INT32 to INT16 The MySQL connector now maps TINYINT and SMALLINT columns to INT16 (rather than INT32) because INT16 is smaller and yet still large enough for all TINYINT and SMALLINT values. Note that the range of TINYINT values is either -128 to 127 for signed or 0 to 255 for unsigned, and thus INT8 is not an acceptable choice since it can only handle values in the range 0 to 255. Additionally, the JDBC Specification also suggests the proper Java type for SQL-99's TINYINT is short, which maps to Kafka Connect's INT16. This change will be backward compatible, although the generated Kafka Connect schema will be different than in previous versions. This shouldn't cause a problem, since clients should expect to handle schema changes, and this schema change does comply with Avro schema evolution rules.	2016-07-19 11:11:05 -05:00
Randall Hauch	04eef2da5c	DBZ-84 Tried to replicate error with MySQL TINYINT columns Tried unsuccessfully to replicate the problem reported in DBZ-84 with a new regression integration test.	2016-07-19 10:58:28 -05:00
Randall Hauch	a88bcb9ae7	DBZ-86 Generated Kafka Schema names will now also be valid Avro fullnames	2016-07-15 16:29:52 -05:00
Randall Hauch	12e7cfb8d3	DBZ-2 Created initial Maven module with a MongoDB connector Added a new `debezium-connector-mongodb` module that defines a MongoDB connector. The MongoDB connector can capture and record the changes within a MongoDB replica set, or when seeded with addresses of the configuration server of a MongoDB sharded cluster, the connector captures the changes from the each replica set used as a shard. In the latter case, the connector even discovers the addition of or removal of shards. The connector monitors each replica set using multiple tasks and, if needed, separate threads within each task. When a replica set is being monitored for the first time, the connector will perform an "initial sync" of that replica set's databases and collections. Once the initial sync has completed, the connector will then begin tailing the oplog of the replica set, starting at the exact point in time at which it started the initial sync. This equivalent to how MongoDB replication works. The connector always uses the replica set's primary node to tail the oplog. If the replica set undergoes an election and different node becomes primary, the connector will immediately stop tailing the oplog, connect to the new primary, and start tailing the oplog using the new primary node. Likewise, if connector experiences any problems communicating with the replica set members, it will try to reconnect (using exponential backoff so as to not overwhelm the replica set) and continue tailing the oplog from where it last left off. In this way the connector is able to dynamically adjust to changes in replica set membership and to automatically handle communication failures. The MongoDB oplog contains limited information, and in particular the events describing updates and deletes do not actually have the before or after state of the documents. Instead, the oplog events are all idempotent, so updates contain the effective changes that were made during an update, and deletes merely contain the deleted document identifier. Consequently, the connector is limited in the information it includes in its output events. Create and read events do contain the initial state, but the update contain only the changes (rather than the before and/or after states of the document) and delete events do not have the before state of the deleted document. All connector events, however, do contain the local system timestamp at which the event was processed and _source_ information detailing the origins of the event, including the replica set name, the MongoDB transaction timestamp of the event, and the transactions identifier among other things. It is possible for MongoDB to lose commits in specific failure situations. For exmaple, if the primary applies a change and records it in its oplog before it then crashes unexpectedly, the secondary nodes may not have had a chance to read those changes from the primary's oplog before the primary crashed. If one such secondary is then elected as primary, it's oplog is missing the last changes that the old primary had recorded and no longer has those changes. In these cases where MongoDB loses changes recorded in a primary's oplog, it is possible that the MongoDB connector may or may not capture these lost changes.	2016-07-14 13:02:36 -05:00
Randall Hauch	6749518f66	[maven-release-plugin] prepare for next development iteration	2016-06-08 13:00:50 -05:00
Randall Hauch	d5bbb116ed	[maven-release-plugin] prepare release v0.2.0	2016-06-08 13:00:50 -05:00
Randall Hauch	cf26a5c4e0	Removed duplicate versions in POMs	2016-06-08 09:46:05 -05:00
Randall Hauch	a143871abd	DBZ-61 Improved MySQL connector's handling of binary values Binary values read from the MySQL binlog may include strings, in which case they need to be converted to binary values. Interestingly, work on this uncovered [KAFKA-3803](https://issues.apache.org/jira/browse/KAFKA-3803) whereby Kafka Connect's `Struct.equals` method does not properly handle comparing `byte[]` values. Upon researching the problem and potentially supplying a patch, it was discovered that the Kafka Connect codebase and the Avro converter all use `ByteBuffer` objects rather than `byte[]`. Consequently, the Debezium code that converts JDBC values to Kafka Connect values was changed to return `ByteBuffer` objects rather than `byte[]` objects. Unfortunately, the JSON converter rehydrates objects with just `byte[]`, so that still means that Debezium's `VerifyRecords` logic cannot rely upon `Struct.equals` for comparison, and instead needs custom logic.	2016-06-07 17:53:07 -05:00
Randall Hauch	f48d48e114	DBZ-37 Added integration test with MySQL GTIDs Added a Maven profile to the MySQL connector component with a Docker image that runs MySQL with GTIDs enabled. The same integration tests can be run with it using `-Pgtid-mysql` or `-Dgtid-mysql` in the Maven build. When the MySQL connector starts up, it now queries the MySQL server to detect whether GTIDs are enabled, and if they are it will also verify that any GTID sets from the most recently recorded offset are still available in the MySQL server (similarly to how it was already doing this for binlog filenames). If the server does not have the correct coordinates/GTIDs, the connector fails with a useful error message. This commit also tests and adjusts the `GtidSet` class to better deal with comparisons of GTID sets for proper ordering. It also changes the connector to output MySQL's timestamp for each event using _second_ precision rather than artificially in _millisecond_ precision. To clarify the different, this change renames the field in the event's `source` structure that records the MySQL timestamp from `ts` to `ts_sec`. Similarly, the envelope's field that records the time that the connector processed each record was renamed from `ts` to `ts_ms`. All unit and integration tests pass with the default profile and with the new GTID-enabled profile.	2016-06-07 12:01:51 -05:00
Randall Hauch	e91aac5b18	DBZ-37 DatabaseHistory can now use custom logic to compare offsets DatabaseHistory stores the DDL changes with the offset describing the position in the source where those DDL statements were found. When a connector restarts at a specific offset (supplied by Kafka Connect), connectors such as the MySQL connector reconstruct the database schemas by having DatabaseHistory load the history starting from the beginning and stopping at (or just before) the connector's starting offset. This change allows connectors to supply a custom comparison function. To support GTIDs, the MySQL connector needed to store additional information in the offsets. This means the logic needed to compare offsets with and without GTIDs is non-trivial and unique to the MySQL connector. This commit adds a custom comparison function for offsets. Per [MySQL documentation](https://dev.mysql.com/doc/refman/5.7/en/replication-gtids-failover.html), slaves are always expected to start with the same set of GTIDs as the master, so no matter which the MySQL connector follows it should always have the complete set of GTIDs seen by that server. Therefore: * Two offsets with GTID sets can be compared using only the GTID sets. * Any offset with a GTID set is always assumed to be newer than an offset without, since it is assumed once GTIDs are enabled they will remain enabled. (Otherwise, the connector likely needs to be restarted with a snapshot and tied to a specific master or slave with no failover.) * Two offsets without GTIDs are compared using the binlog coordinates (filename, position, and row number). * An offsets that is identical to another except for being in snapshot mode is considered earlier than without the snapshot. This is because snapshot mode begins by recording the position of the snapshot, and once complete the offset is recorded without the snapshot flag.	2016-06-04 16:20:26 -05:00
Randall Hauch	655aac7d4f	DBZ-37 Added support for MySQL GTIDs The BinlogClient library our MySQL connector uses already has support for GTIDs. This change makes use of that and adds the GTIDs from the server to the offsets created by the connector and used upon restarts.	2016-06-02 18:30:26 -05:00
Randall Hauch	264a9041df	DBZ-64 Added Avro Converter to record verification utilities The `VerifyRecord` utility class has methods that will verify a `SourceRecord`, and is used in many of our integration tests to check whether records are constructed in a valid manner. The utility already checks whether the records can be serialized and deserialized using the JSON converter (provided with Kafka Connect); this change also checks with the Avro Converter (which produces much smaller records and is more suitable for production). Note that version 3.0.0 of the Confluent Avro Converter is required; version 2.1.0-alpha1 could not properly handle complex Schema objects with optional fields (see https://github.com/confluentinc/schema-registry/pull/280). Also, the names of the Kafka Connect schemas used in MySQL source records has changed. # The record's envelope Schema used to be "<serverName>.<database>.<table>" but is now "<serverName>.<database>.<table>.Envelope". # The Schema for record keys used to be named "<database>.<table>/pk", but the '/' character is not valid within a Avro name, and has been changed to "<serverName>.<database>.<table>.Key". # The Schema for record values used to be named "<database>.<table>", but to better fit with the other Schema names it has been changed to "<serverName>.<database>.<table>.Value". Thus, all of the Schemas for a single database table have the same Avro namespace "<serverName>.<database>.<table>" (or "<topicName>") with Avro schema names of "Envelope", "Key", and "Value". All unit and integration tests pass.	2016-06-02 16:54:21 -05:00
Randall Hauch	46c0ce9882	DBZ-58 Added MDC logging contexts to connector Changed the MySQL connector to make use of MDC logging contexts, which allow thread-specific parameters that can be written out on every log line by simply changing the logging configuration (e.g., Log4J configuration file). We adopt a convention for all Debezium connectors with the following MDC properties: * `dbz.connectorType` - the type of connector, which would be a single well-known value for each connector (e.g., "MySQL" for the MySQL connector) * `dbz.connectorName` - the name of the connector, which for the MySQL connector is simply the value of the `server.name` property (e.g., the logical name for the MySQL server/cluster). Unfortunately, Kafka Connect does not give us its name for the connector. * `dbz.connectorContext` - the name of the thread, which is "main" for thread running the connector; the MySQL connector uses "snapshot" for the thread started by the snapshot reader, and "binlog" for the thread started by the binlog reader. Different logging frameworks have their own way of using MDC properties. In a Log4J configuration, for example, simply use `%X{name}` in the logger's layout, where "name" is one of the properties listed above (or another MDC property).	2016-06-02 14:05:06 -05:00
Randall Hauch	58a5d8c033	DBZ-31 Added support for possibly performing snapshot upon startup Refactored the MySQL connector to break out the logic of reading the binlog into a separate class, added a similar class to read a full snapshot, and then updated the MySQL connector task class to use both. Added several test cases and updated the existing tests.	2016-06-01 21:40:53 -05:00
Randall Hauch	e6c0ff5e4d	DBZ-31 Refactored the MySQL Connector Several of the MySQL connector classes were fairly large and complicated, and to prepare for upcoming changes/enhancements these larger classes were refactored to pull out units of functionality. Currently all unit tests pass with these changes, with additional unit tests for these new components.	2016-05-26 15:58:58 -05:00
Randall Hauch	24e99fb28f	DBZ-31 DDL parser now supports '#' as comment line prefix	2016-05-26 15:40:50 -05:00
Randall Hauch	dc5a379764	DBZ-55 Corrected filtering of DDL statements based upon affected database Previously, the DDL statements were being filtered and recorded based upon the name of the database that appeared in the binlog. However, that database name is actually the name of the database to which the client submitting the operation is connected, and is not necessarily the database _affected_ by the operation (e.g., when an operation includes a fully-qualified table name not in the connected-to database). With these changes, the table/database affected by the DDL statements is now being used to filter the recording of the statements. The order of the DDL statements is still maintained, but since each DDL statement can apply to a separate database the DDL statements are batched (in the same original order) based upon the affected database. For example, two statements affecting "db1" will get batched together into one schema change record, followed by one statement affecting "db2" as a second schema change record, followed by another statement affecting "db1" as a third schema record. Meanwhile, this change does not affect how the database history records the changes: it still records them as submitted using a single record for each separate binlog event/position. This is much safer as each binlog event (with specific position) is written atomically to the history stream. Also, since the database history stream is what the connector uses upon recovery, the database history records are now written _after_ any schema change records to ensure that, upon recovery after failure, no schema change records are lost (and instead have at-least-once delivery guarantees).	2016-05-23 11:01:27 -05:00
Randall Hauch	07315f2b4b	DBZ-43 Changed form of schema change topic to use schemas	2016-05-19 16:54:22 -05:00
Randall Hauch	c0b7114424	DBZ-52 Added top-level container structure to all messages The new envelope Struct contains fields for the local time at which the connector processed the event, the kind of operation (e.g., read, insert, update, or delete), the state of the record before and after the change, and the information about the event source. The latter two items are connector-specific. The timestamp is merely the time using the connector's process clock, and no guarantees are provided about accuracy, monotonicity, or relationship to the original source event. The envelope structure is now used as the value for each event message in the MySQL connector; they keys of the event messages remain unchanged. Note that to facilitate Kafka log compaction (which requires a null value), a delete event containing the envelope with details about the deletion is followed by a "tombstone" event that contains the same key but null value. An example of a message value with this new envelope is as follows: { "schema" : { "type" : "struct", "fields" : [ { "type" : "struct", "fields" : [ { "type" : "int32", "optional" : false, "name" : "org.apache.kafka.connect.data.Date", "version" : 1, "field" : "order_date" }, { "type" : "int32", "optional" : false, "field" : "purchaser" }, { "type" : "int32", "optional" : false, "field" : "quantity" }, { "type" : "int32", "optional" : false, "field" : "product_id" } ], "optional" : true, "name" : "connector_test.orders", "field" : "before" }, { "type" : "struct", "fields" : [ { "type" : "int32", "optional" : false, "name" : "org.apache.kafka.connect.data.Date", "version" : 1, "field" : "order_date" }, { "type" : "int32", "optional" : false, "field" : "purchaser" }, { "type" : "int32", "optional" : false, "field" : "quantity" }, { "type" : "int32", "optional" : false, "field" : "product_id" } ], "optional" : true, "name" : "connector_test.orders", "field" : "after" }, { "type" : "struct", "fields" : [ { "type" : "string", "optional" : false, "field" : "server" }, { "type" : "string", "optional" : false, "field" : "file" }, { "type" : "int64", "optional" : false, "field" : "pos" }, { "type" : "int32", "optional" : false, "field" : "row" } ], "optional" : false, "name" : "io.debezium.connector.mysql.Source", "field" : "source" }, { "type" : "string", "optional" : false, "field" : "op" }, { "type" : "int64", "optional" : true, "field" : "ts" } ], "optional" : false, "name" : "kafka-connect-2.connector_test.orders", "version" : 1 }, "payload" : { "before" : null, "after" : { "order_date" : 16852, "purchaser" : 1003, "quantity" : 1, "product_id" : 107 }, "source" : { "server" : "kafka-connect-2", "file" : "mysql-bin.000002", "pos" : 2887680, "row" : 4 }, "op" : "c", "ts" : 1463437199134 } } Notice how the Schema is significantly larger, since it must describe all of the envelope's fields even when those fields are not used. In this case, the event signifies that a record was created as the 4th record of a single event recorded in the binlog.	2016-05-19 12:40:16 -05:00
Randall Hauch	6d56a8f3d0	DBZ-50 Added parameters for truncated length and when the field is masked.	2016-05-12 16:31:33 -05:00
Randall Hauch	b1e6eb1028	DBZ-29 Refactored ColumnMappers and enabled ColumnMapper impls to add parameters to the Kafka Connect Schema.	2016-05-12 12:26:04 -05:00
Randall Hauch	18995abfbd	Merge pull request #38 from rhauch/dbz-29 DBZ-29 Changed MySQL connector to be able to hide, truncate, and mask specific columns	2016-05-12 08:27:15 -05:00
Randall Hauch	ff9d0fc240	DBZ-29 Changed MySQL connector to be able to hide, truncate, and mask specific columns Changed the MySQL connector to use comma-separated lists of regular expressions for the database and table whitelist/blacklists. Literals are still accepted and will match fully-qualified table names, although the '.' character used as a delimiter is also a special character in regular expressions and therefore may need to be escaped with a double backslash ('\\') to more carefully match fully-qualified table names. Added several new configuration properties for the MySQL connector that instruct it to hide, truncate, and/or mask certain columns. The properties' values are all lists of regular expressions or literal fully-qualified column names. For example, the following configuration property: column.blacklist=server.users.picture,server.users.other will cause the connector to leave out of change event messages for the `server.users` table those fields that correspond to the `picture` and `others` columns. This capability can be used to This capability can be used to prevent dissemination of sensitive information in the change event stream. An alternative to blacklisting is masking. The following configuration property: column.mask.with.10.chars=server\\.users\\.(\\wemail) will cause the connector to mask in the change event messages for the `server.users` table all values for columns whose name ends in `email`. The values will be replaced in this case with a constant string of 10 asterisk ('') characters, even when the email value is null. This capability can also be used to prevent dissemination of sensitive information in the change event stream. Another option is to truncate string values for specific columns. The following configuration property: column.truncate.to.120.chars=server[.]users[.](description\|biography) will cause the connector to truncate to at most 120 characters the values of the `description` and `biography` columns in the change event messages for the `server.users` table. Although this example used a limit of 120 characters, any positive length can be specified; separate properties should be used when different lengths are required. Note how the '.' delimiter in the fully-qualified names is escaped since that same character is a special character in regular expressions. This capability can be used to reduce the size of change event messages.	2016-05-11 15:57:06 -05:00
Christian Posta	8b736ef654	DBZ-48 Cannot parse COMMIT and flush statements	2016-05-05 15:36:24 -07:00
Randall Hauch	1fcb4b02cf	DBZ-38 Changed DROP VIEW and TABLE to include single-table statements in events Drop table/view statements that involve more than one table generate one event for each table/view. Previously, each of those statements had the original multi-table/view statement. Now, each event has a statement that applies to only that table (generated from the original with all the same clauses).	2016-04-12 18:18:13 -05:00
Randall Hauch	b1e428c986	DBZ-38 Adjusted how events are generated for RENAME TO statements The previous change did not correctly capture the statements for a `RENAME TO` that renamed multiple tables, so fixed the code so that it generates a single `RENAME TO` for each table rename.	2016-04-12 17:58:07 -05:00
Randall Hauch	5b30568650	DBZ-38 Changed the listening framework of the DDL parser Refactored the mechanism by which components can listen to the activities of a DDL parser. The new approach should be significantly more flexible for additional types of DDL events while making it easier to maintain backward compatibility. It also will enable passing event-specific information on each DDL event.	2016-04-12 11:00:02 -05:00
Randall Hauch	137b9f6d4d	DBZ-38 Changed the DDL parser framework to notify listeners as statements are applied.	2016-04-11 15:16:04 -05:00
Randall Hauch	8f5487b2c0	[maven-release-plugin] prepare for next development iteration	2016-03-17 16:28:40 -05:00
Randall Hauch	c2b8ac50ae	[maven-release-plugin] prepare release v0.1.0	2016-03-17 16:28:40 -05:00
Randall Hauch	43f79aad5e	Added missing version element to modules	2016-03-17 16:14:17 -05:00
Randall Hauch	5a002dbf62	DBZ-15 Cached converters are now dropped upon log rotation.	2016-03-17 11:03:28 -05:00
Randall Hauch	4998325de7	DBZ-30 Changed the MySQL connector to include all columns in the record value	2016-03-04 10:51:14 -06:00
Randall Hauch	9034e26d1e	DBZ-26 Corrected the embedded connector framework to enable stopping. Also improved logging statements.	2016-03-03 15:27:11 -06:00
Randall Hauch	1d46e59048	DBZ-17 Minor changes to the POMs	2016-02-18 13:58:29 -06:00
Randall Hauch	2e5dfd837b	DBZ-13 Minor code changes to eliminate JavaDoc warnings	2016-02-17 11:15:21 -06:00
Randall Hauch	73f3c9836b	DBZ-1 Completed integration testing and debugging of the MySQL connector	2016-02-15 14:46:12 -06:00
Randall Hauch	1a59f9b07c	DBZ-11 Build can skip long-running unit and integration tests	2016-02-04 15:35:27 -06:00
Randall Hauch	54b822bb72	DBZ-10 Added small utility so unit tests can run an embedded Kafka cluster within the same process. This utility is only suitable for unit tests and therefore is defined in the test JAR of the `debezium-core` module. It certainly should never be used for production purposes.	2016-02-04 15:18:27 -06:00
Randall Hauch	c501f8486f	DBZ-9 Added MySQL whitelist and blacklists on tables and databases.	2016-02-04 07:56:13 -06:00
Randall Hauch	37d6a5e7da	DBZ-1 Expanded documentation and improved EmbeddedConnector framework Changed the EmbeddedConnector framework to initialize all major components via configuration properties rather than through the public builder. This increases the size of the configurations, but it simplifies what embedding applications must do to obtain an EmbeddedConnector instance. The DatabaseHistory framework was also changed to be configurable in similar ways to the OffsetBackingStore. Essentially, connectors that want to use it (like the MySqlConnector) will describe it as part of the connector's configuration, allowing more flexibility in which DatabaseHistory implementation is used and how it is configured whether in Kafka Connector or as part of the EmbeddedConnector. Added a README.md to `debezium-embedded` to provide documentation and sample code showing how to use the EmbeddedConnector.	2016-02-03 14:11:53 -06:00
Randall Hauch	2da5b37f76	DBZ-1 Added support for recording and recovering database schema Adds a small framework for recording the DDL operations on the schema state (e.g., Tables) as they are read and applied from the log, and when restarting the connector task to recover the accumulated schema state. Where and how the DDL operations are recorded is an abstraction called `DatabaseHistory`, with three options: in-memory (primarily for testing purposes), file-based (for embedded cases and perhaps standalone Kafka Connect uses), and Kafka (for normal Kafka Connect deployments). The `DatabaseHistory` interface methods take several parameters that are used to construct a `SourceRecord`. The `SourceRecord` type was not used, however, since that would result in this interface (and potential extension mechanism) having a dependency on and exposing the Kafka API. Instead, the more general parameters are used to keep the API simple. The `FileDatabaseHistory` and `MemoryDatabaseHistory` implementations are both fairly simple, but the `FileDatabaseHistory` relies upon representing each recorded change as a JSON document. This is simple, is easily written to files, allows for recovery of data from the raw file, etc. Although this was done initially using Jackson, the code to read and write the JSON documents required a lot of boilerplate. Instead, the `Document` framework developed during Debezium's very early prototype stages was brought back. It provides a very usable API for working with documents, including the ability to compare documents semantically (e.g., numeric values are converted to be able to compare their numeric values rather than just compare representations) and with or without field order. The `KafkaDatabaseHistory` is a bit more complicated, since it uses a Kafka broker to record all database schema changes on a single topic with single partition, and then upon restart uses it to recover the history from the dedicated topics. This implementation also records the changes as JSON documents, keeping it simple and independent of the Kafka Connect converters.	2016-02-02 14:27:14 -06:00
Randall Hauch	6796fe32be	DBZ-1 Added the initial stages of a MySQL source connector The connector is in a basic working state, although it is not well tested yet and upon restart does not recover the schema state from the previous run.	2016-01-29 10:12:28 -06:00
Randall Hauch	d9090ed67b	DBZ-4 Removed unused files, most of which were originally copied from the ModeShape codebase.	2016-01-27 08:37:23 -06:00
Randall Hauch	4c538d4e54	DBZ-4 Changed copyright statement in source code headers and adjusted checkstyle rules.	2016-01-27 08:12:01 -06:00
Randall Hauch	eff1f665fa	Updated checkstyle rule for headers, and corrected several incorrect headers.	2016-01-25 18:59:25 -06:00
Randall Hauch	a0a8953d2a	Updated the copyright dates per new approach.	2016-01-25 18:33:08 -06:00
Randall Hauch	4ddd4b33be	Changed Docker usage on Travis-CI	2016-01-25 16:12:07 -06:00
Randall Hauch	772977f391	Attempted to correct Travis build error	2016-01-25 13:50:17 -06:00
Randall Hauch	5e4c428285	Correct return type for function	2016-01-25 13:41:38 -06:00
Randall Hauch	71e90b5a69	Added MySQL ingest module with support for reading DDL statements.	2016-01-23 08:26:52 -06:00
Randall Hauch	8e6c615644	Added utilities for managing a relational schema's table definitions, with support for updating those by reading DDL	2016-01-20 08:53:29 -06:00
Randall Hauch	dffdfd8049	Added debezium-core and MySQL binary log reading tests.	2015-11-24 15:54:37 -06:00

... 30 31 32 33 34 ...

2066 Commits