DBZ-2 Created initial Maven module with a MongoDB connector

Added a new `debezium-connector-mongodb` module that defines a MongoDB connector. The MongoDB connector can capture and record the changes within a MongoDB replica set or, when seeded with the addresses of the configuration server of a MongoDB sharded cluster, the changes from each replica set used as a shard. In the latter case, the connector even discovers the addition or removal of shards.

The connector monitors each replica set using multiple tasks and, if needed, separate threads within each task. When a replica set is being monitored for the first time, the connector will perform an "initial sync" of that replica set's databases and collections. Once the initial sync has completed, the connector will begin tailing the oplog of the replica set, starting at the exact point in time at which it started the initial sync. This is equivalent to how MongoDB replication works.

The connector always uses the replica set's primary node to tail the oplog. If the replica set undergoes an election and a different node becomes primary, the connector will immediately stop tailing the oplog, connect to the new primary, and start tailing the oplog using the new primary node. Likewise, if the connector experiences any problems communicating with the replica set members, it will try to reconnect (using exponential backoff so as to not overwhelm the replica set) and continue tailing the oplog from where it last left off. In this way the connector is able to dynamically adjust to changes in replica set membership and to automatically handle communication failures.

The MongoDB oplog contains limited information, and in particular the events describing updates and deletes do not actually have the before or after state of the documents. Instead, the oplog events are all idempotent, so updates contain the effective changes that were made during an update, and deletes merely contain the deleted document identifier. Consequently, the connector is limited in the information it includes in its output events. Create and read events do contain the initial state, but update events contain only the changes (rather than the before and/or after states of the document), and delete events do not have the before state of the deleted document. All connector events, however, do contain the local system timestamp at which the event was processed and _source_ information detailing the origins of the event, including the replica set name, the MongoDB transaction timestamp of the event, and the transaction identifier, among other things.

It is possible for MongoDB to lose commits in specific failure situations. For example, if the primary applies a change and records it in its oplog before it then crashes unexpectedly, the secondary nodes may not have had a chance to read those changes from the primary's oplog before the primary crashed. If one such secondary is then elected as primary, its oplog is missing the last changes that the old primary had recorded, and those changes are lost. In these cases, where MongoDB loses changes recorded in a primary's oplog, the MongoDB connector may or may not capture the lost changes.
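
The reconnect behavior described above (retrying with exponential backoff so the replica set is not overwhelmed) can be illustrated with a small shell sketch. This is not the connector's actual implementation; the function names `backoff_delay` and `retry_with_backoff`, the 1-second base delay, and the 60-second cap are all assumptions chosen for illustration.

```shell
# Hypothetical helper: print the delay in seconds before reconnect attempt $1
# (1-based). The delay doubles from base $2 (default 1) and is capped at
# max $3 (default 60). The body runs in a subshell so no variables leak.
backoff_delay() (
  attempt=$1 base=${2:-1} max=${3:-60}
  delay=$base
  i=1
  while [ "$i" -lt "$attempt" ] && [ "$delay" -lt "$max" ]; do
    delay=$(( delay * 2 ))
    i=$(( i + 1 ))
  done
  [ "$delay" -gt "$max" ] && delay=$max
  echo "$delay"
)

# Hypothetical helper: run the given command until it succeeds, giving up
# after $1 attempts and sleeping for a growing delay between attempts.
# Because the body is a subshell, the command cannot change the caller's
# shell variables.
retry_with_backoff() (
  max_attempts=$1
  shift
  attempt=1
  until "$@"; do
    [ "$attempt" -ge "$max_attempts" ] && exit 1
    sleep "$(backoff_delay "$attempt")"
    attempt=$(( attempt + 1 ))
  done
)
```

With these defaults the waits between attempts grow as 1, 2, 4, 8, ... seconds until they reach the cap, so a struggling replica set sees progressively fewer connection attempts.
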
2016-04-19 22:49:58 +02:00
This module builds and runs two containers based upon the [mongo:3.2](https://hub.docker.com/_/mongo/) Docker image. The first _primary_ container starts MongoDB, while the second _initiator_ container initializes the replica set and then terminates.
## Using MongoDB
As mentioned in the [README.md]() file, our Maven build can be used to start these containers using the `mongo:3.2` image:

    $ mvn docker:start

The command leaves the primary container running so that you can use the running MongoDB server. For example, you can establish a `bash` shell inside the container (named `mongo1`) by using Docker in another terminal:

    $ docker exec -it mongo1 bash

Or you can run integration tests from your IDE, as described in detail in the [README.md]() file.
To stop and remove the `mongo1` container, simply use the following Maven command:

    $ mvn docker:stop

or use the following Docker commands:

    $ docker stop mongo1
    $ docker rm mongo1

## Using Docker directly
Although using the Maven command is far simpler, the Maven build for the `alt-server` profile really just runs (via the Jolokia Maven plugin) a Docker command to start the container, so it's equivalent to:

    $ docker run -it --rm --name mongo mongo:3.2 --replSet rs0 --oplogSize=2 --enableMajorityReadConcern

This will use the `mongo:3.2` image to start a new container named `mongo`. This can be repeated multiple times to start multiple MongoDB secondary nodes:

    $ docker run -it --rm --name mongo1 mongo:3.2 --replSet rs0 --oplogSize=2 --enableMajorityReadConcern
    $ docker run -it --rm --name mongo2 mongo:3.2 --replSet rs0 --oplogSize=2 --enableMajorityReadConcern

Then, run the initiator container to initialize the replica set by assigning the `mongo` container as primary and the other containers as secondary nodes:

    $ docker run -it --rm --name mongoinit --link mongo:mongo --link mongo1:mongo1 --link mongo2:mongo2 -e REPLICASET=rs0 debezium/mongo-replicaset-initiator:3.2

Once the replica set is initialized, the `mongoinit` container will complete and be removed.
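
After the `mongoinit` container completes, it can be handy to confirm that a node is actually reporting itself as primary before running tests against it. The following is a minimal sketch, assuming the container is named `mongo` as above and that the `mongo` shell is available inside it (as it is in the `mongo:3.2` image). The function name `wait_for_primary` and its defaults are hypothetical.

```shell
# Hypothetical helper: poll the node in container $1 until it reports itself
# as primary. Tries up to $2 times (default 30), pausing $3 seconds
# (default 1) between attempts. Returns 0 on success, 1 on timeout.
wait_for_primary() (
  container=$1 attempts=${2:-30} pause=${3:-1}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # db.isMaster().ismaster is true once this node is the primary.
    state=$(docker exec "$container" mongo --quiet \
      --eval 'print(db.isMaster().ismaster)' 2>/dev/null) || state=
    [ "$state" = "true" ] && exit 0
    sleep "$pause"
    i=$(( i + 1 ))
  done
  exit 1
)
```

For example, `wait_for_primary mongo 60 1` would wait up to roughly a minute for the `mongo` container to become primary.
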
### Use MongoDB client
The following command can be used to manually start up a Docker container to run the MongoDB command line client:

    $ docker run -it --link mongo:mongo --rm mongo:3.2 sh -c 'exec mongo "$MONGO_PORT_27017_TCP_ADDR:$MONGO_PORT_27017_TCP_PORT"'

Note that the client container must be linked to the MongoDB container to which it will connect.
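
The same linked-client approach can be used non-interactively to look at the oplog that the connector tails. In a replica set the oplog is the `oplog.rs` collection in the `local` database. The following is a sketch only, assuming a container named `mongo` is running as a replica set member; the function name `show_recent_oplog` is hypothetical.

```shell
# Hypothetical helper: print the $1 most recent oplog entries (default 5)
# using a throwaway client container linked to the `mongo` container.
show_recent_oplog() (
  limit=${1:-5}
  # Newest entries first; \$ keeps the local shell from expanding $natural.
  query="db.oplog.rs.find().sort({\$natural: -1}).limit($limit).forEach(printjson)"
  # The query is handed to the inner shell as $0, so it needs no further
  # quoting inside the container.
  docker run --rm --link mongo:mongo mongo:3.2 sh -c \
    'exec mongo --quiet --eval "$0" "$MONGO_PORT_27017_TCP_ADDR:$MONGO_PORT_27017_TCP_PORT/local"' \
    "$query"
)
```

Running `show_recent_oplog 3` after inserting, updating, and deleting a document is a quick way to see what the update and delete entries actually record.
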