Docker / Mongo Issues

Hey folks,

I had this sort of issue in the past already. I am running a docker image with a 3-way real-life (non-docker) mongo cluster. The cluster is alive, online, replicated etc. This setup does work with RocketChat in Docker 4.8.1. Upgrading to 5.x fails with

[root@rc01:~/new-docker] # docker-compose up
WARNING: The DEPLOY_PLATFORM variable is not set. Defaulting to a blank string.
Creating network "new-docker_default" with the default driver
Creating new-docker_rocketchat_1 ... done
Attaching to new-docker_rocketchat_1
rocketchat_1  | /app/bundle/programs/server/node_modules/fibers/future.js:313
rocketchat_1  | 						throw(ex);
rocketchat_1  | 						^
rocketchat_1  | 
rocketchat_1  | MongoServerSelectionError: Server selection timed out after 30000 ms
rocketchat_1  |     at Timeout._onTimeout (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb/lib/sdam/topology.js:312:38)
rocketchat_1  |     at listOnTimeout (internal/timers.js:557:17)
rocketchat_1  |     at processTimers (internal/timers.js:500:7) {
rocketchat_1  |   reason: TopologyDescription {
rocketchat_1  |     type: 'Single',
rocketchat_1  |     servers: Map(3) {
rocketchat_1  |       'rc01.example.com:27017' => ServerDescription {
rocketchat_1  |         _hostAddress: HostAddress {
rocketchat_1  |           isIPv6: false,
rocketchat_1  |           host: 'rc01.example.com',
rocketchat_1  |           port: 27017
rocketchat_1  |         },
rocketchat_1  |         address: 'rc01.example.com:27017',
rocketchat_1  |         type: 'Unknown',
rocketchat_1  |         hosts: [],
rocketchat_1  |         passives: [],
rocketchat_1  |         arbiters: [],
rocketchat_1  |         tags: {},
rocketchat_1  |         minWireVersion: 0,
rocketchat_1  |         maxWireVersion: 0,
rocketchat_1  |         roundTripTime: -1,
rocketchat_1  |         lastUpdateTime: 1166882,
rocketchat_1  |         lastWriteDate: 0
rocketchat_1  |       },
rocketchat_1  |       'rc02.example.com:27017' => ServerDescription {
rocketchat_1  |         _hostAddress: HostAddress {
rocketchat_1  |           isIPv6: false,
rocketchat_1  |           host: 'rc02.example.com',
rocketchat_1  |           port: 27017
rocketchat_1  |         },
rocketchat_1  |         address: 'rc02.example.com:27017',
rocketchat_1  |         type: 'Unknown',
rocketchat_1  |         hosts: [],
rocketchat_1  |         passives: [],
rocketchat_1  |         arbiters: [],
rocketchat_1  |         tags: {},
rocketchat_1  |         minWireVersion: 0,
rocketchat_1  |         maxWireVersion: 0,
rocketchat_1  |         roundTripTime: -1,
rocketchat_1  |         lastUpdateTime: 1166881,
rocketchat_1  |         lastWriteDate: 0
rocketchat_1  |       },
rocketchat_1  |       'rc03.example.com:27017' => ServerDescription {
rocketchat_1  |         _hostAddress: HostAddress {
rocketchat_1  |           isIPv6: false,
rocketchat_1  |           host: 'rc03.example.com',
rocketchat_1  |           port: 27017
rocketchat_1  |         },
rocketchat_1  |         address: 'rc03.example.com:27017',
rocketchat_1  |         type: 'Unknown',
rocketchat_1  |         hosts: [],
rocketchat_1  |         passives: [],
rocketchat_1  |         arbiters: [],
rocketchat_1  |         tags: {},
rocketchat_1  |         minWireVersion: 0,
rocketchat_1  |         maxWireVersion: 0,
rocketchat_1  |         roundTripTime: -1,
rocketchat_1  |         lastUpdateTime: 1166886,
rocketchat_1  |         lastWriteDate: 0
rocketchat_1  |       }
rocketchat_1  |     },
rocketchat_1  |     stale: false,
rocketchat_1  |     compatible: true,
rocketchat_1  |     heartbeatFrequencyMS: 10000,
rocketchat_1  |     localThresholdMS: 15,
rocketchat_1  |     setName: 'rc010',
rocketchat_1  |     logicalSessionTimeoutMinutes: undefined
rocketchat_1  |   }
rocketchat_1  | }

As I am using a real-life mongo cluster, this is my compose.yml:

volumes:
  rocketchat:

services:
  rocketchat:
    image: registry.rocket.chat/rocketchat/rocket.chat:${RELEASE:-latest}
    restart: unless-stopped
    environment:
      MONGO_URL: mongodb://rc01.example.com:27017,rc02.example.com:27017,rc03.example.com:27017/rocketchat?authSource=admin&replicaSet=rc010&w=majority&directConnection=true
      MONGO_OPLOG_URL: mongodb://rc01.example.com:27017,rc02.example.com:27017,rc03.example.com:27017/local?authSource=admin&replicaSet=rc01&directConnection=true
      ROOT_URL: https://chat.example.com
      PORT: ${PORT:-3000}
      DEPLOY_METHOD: docker
      DEPLOY_PLATFORM: ${DEPLOY_PLATFORM}
    expose:
      - ${PORT:-3000}
    ports:
      - "${BIND_IP:-0.0.0.0}:${HOST_PORT:-3000}:${PORT:-3000}"
    volumes:
      - /var/rocketchat:/uploads

The mongo_url worked before; I just added &directConnection=true to the url, to no avail. My .env file:

RELEASE=5.1.2
DOMAIN=chat.example.com

I can confirm Mongo is up and running in this version:

mongodb-org-database-tools-extra-5.0.12-1.el8.x86_64
mongodb-org-tools-5.0.12-1.el8.x86_64
mongodb-org-mongos-5.0.12-1.el8.x86_64
mongodb-org-database-5.0.12-1.el8.x86_64
mongodb-mongosh-1.5.4-1.el8.x86_64
mongodb-database-tools-100.6.0-1.x86_64
mongodb-org-server-5.0.12-1.el8.x86_64
mongodb-org-shell-5.0.12-1.el8.x86_64
mongodb-org-5.0.12-1.el8.x86_64

The three nodes of the cluster, rc01, rc02 and rc03, each run one instance of rocketchat and one instance of mongo. There are no firewalls between the nodes. Like I said, it does work with 4.8.1, not with 5.x. Switching docker to 4.x makes it work, upgrading to 5.x breaks it.

It must be something trivial. Any help? :slight_smile:

-Chris.

Really no one has an idea? :frowning:

Hi!

Sorry for the delay here, missed this one :grimacing:

We did a comprehensive video here about this change, check it out:

TLDSEE: the Node.js MongoDB driver changed, so some configuration has to change with it. As you are using a Mongo cluster, you can reconfigure the replica set members or use the MONGODB_ADVERTISED_HOSTNAME variable.


Hey,

thanks for your reply :slight_smile:
I added

      MONGODB_ADVERTISED_HOSTNAME: rc01.example.com

to compose.yml, same result. I am only guessing at what I should set that to.
It’s great that you took the time to post an extensive 1 1/2 hour video, but it would not be amiss to also post a short “Readme before Update” somewhere. I really appreciate it, though!

And if you could walk with me just a tad further that’s also appreciated!

Yeah, the idea was to do a shorter video, but there was too much to cover, and there is no one solution that fits all, especially when using multiple replica sets, so it’s better to understand the problem at its core so you can choose the best solution.

Try checking the members config of your Mongo replica set and reconfigure it to point to the correct hostname instead of localhost or 127.0.0.1.
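To illustrate that check: in mongosh, rs.conf() returns a document whose members array carries one host string per node. Below is a minimal Python sketch, assuming you have copied that document into a dict. The function name and the sample config are made up for illustration and are not taken from this cluster.

```python
def loopback_members(config):
    """Return the host strings of members that advertise localhost or 127.0.0.1."""
    loopback = {"localhost", "127.0.0.1"}
    return [
        member["host"]
        for member in config.get("members", [])
        # "host" is "name:port"; compare only the name part.
        if member["host"].rsplit(":", 1)[0] in loopback
    ]


# Hypothetical rs.conf() output, reduced to the fields used here.
sample_config = {
    "_id": "rc01",
    "members": [
        {"_id": 0, "host": "localhost:27017"},
        {"_id": 1, "host": "rc02.example.com:27017"},
        {"_id": 2, "host": "rc03.example.com:27017"},
    ],
}

print(loopback_members(sample_config))
```

Any member it lists is advertised under a name that a client in another container or on another host cannot reach.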

Tks!

Ahh,

I think that’s what’s wrong here. I am not running mongo in a docker container, but natively in the guest OS on the docker hosts. Only RocketChat runs inside a docker container. I assume

MONGODB_ADVERTISED_HOSTNAME

is a variable for a possible mongo-docker container and not rocketchat?

That is right.

Or you can change the configuration of the members of your replica set.

But note, this is more an environment/deployment issue than a Rocket.Chat one =\

Also consider that mongo 5.0 has some CPU incompatibilities. For now we are targeting mongo 4.4, but unless you face this CPU incompatibility, you should be fine.

Thanks!

Hey,

replication:
  replSetName: rc01

That’s the entire replication configuration from mongod.conf, all the rest is standard. Did you change anything in your mongo-docker-container?

Help. :slight_smile:

The strange part is that Rocket.Chat seems to not be able to reach your MongoDB at

rc01.example.com:27017

Are they on the same network or is there any firewall between them?

What is the content of the members of your replicaset?

Thanks for continuing to help me. So much appreciated!

The rocketchat container 1 is on host 1 and connecting to mongodb on host 1.
The rocketchat container 2 is on host 2 and connecting to mongodb on host 2.
The rocketchat container 3 is on host 3 and connecting to mongodb on host 3.

Well, in a perfect world. Only one mongo is primary, that’s why I am using

MONGO_URL: mongodb://rc01.example.com:27017,rc02.example.com:27017,rc03.example.com:27017/rocketchat?authSource=admin&replicaSet=rc010&w=majority&directConnection=true

in the config. There are no firewalls in between. Like I said, this setup does work with 4.8.1 currently. Upgrading (only the) rocketchat docker container breaks it, going back to 4.8.1 makes it work again. I can also see access from the upgraded docker containers to mongo, so connectivity is not an issue.

It must be something inside docker container for rocketchat 5.x.

Addendum:

Here are the mongod.log lines for the upgraded docker-container.

{"t":{"$date":"2022-09-23T09:17:55.153+02:00"},"s":"I",  "c":"NETWORK",  "id":51800,   "ctx":"conn13894","msg":"client metadata","attr":{"remote":"10.100.0.101:43520","client":"conn13894","doc":{"driver":{"name":"nodejs","version":"4.3.1"},"os":{"type":"Linux","name":"linux","architecture":"x64","version":"4.18.0-372.26.1.el8_6.x86_64"},"platform":"Node.js v14.19.3, LE (unified)|Node.js v14.19.3, LE (unified)"}}}
{"t":{"$date":"2022-09-23T09:18:14.615+02:00"},"s":"I",  "c":"NETWORK",  "id":22944,   "ctx":"conn13894","msg":"Connection ended","attr":{"remote":"10.100.0.101:43520","uuid":"22136257-990b-46d4-a143-7a8b502d2b95","connectionId":13894,"connectionCount":73}}
{"t":{"$date":"2022-09-23T09:18:14.615+02:00"},"s":"I",  "c":"NETWORK",  "id":22944,   "ctx":"conn13893","msg":"Connection ended","attr":{"remote":"10.100.0.101:60100","uuid":"fa7772e0-40b5-44ab-b7fc-49cb417de741","connectionId":13893,"connectionCount":72}}
{"t":{"$date":"2022-09-23T09:18:16.882+02:00"},"s":"I",  "c":"NETWORK",  "id":22943,   "ctx":"listener","msg":"Connection accepted","attr":{"remote":"10.100.0.101:58386","uuid":"182e808a-d530-4e18-9032-e06a5052de1f","connectionId":13895,"connectionCount":73}}
{"t":{"$date":"2022-09-23T09:18:16.893+02:00"},"s":"I",  "c":"NETWORK",  "id":51800,   "ctx":"conn13895","msg":"client metadata","attr":{"remote":"10.100.0.101:58386","client":"conn13895","doc":{"driver":{"name":"nodejs","version":"4.3.1"},"os":{"type":"Linux","name":"linux","architecture":"x64","version":"4.18.0-372.26.1.el8_6.x86_64"},"platform":"Node.js v14.19.3, LE (unified)|Node.js v14.19.3, LE (unified)"}}}
{"t":{"$date":"2022-09-23T09:18:27.407+02:00"},"s":"I",  "c":"NETWORK",  "id":22943,   "ctx":"listener","msg":"Connection accepted","attr":{"remote":"10.100.0.101:33260","uuid":"becf7b18-33e7-41d0-b762-161563e1acac","connectionId":13896,"connectionCount":74}}
{"t":{"$date":"2022-09-23T09:18:27.412+02:00"},"s":"I",  "c":"NETWORK",  "id":51800,   "ctx":"conn13896","msg":"client metadata","attr":{"remote":"10.100.0.101:33260","client":"conn13896","doc":{"driver":{"name":"nodejs","version":"4.3.1"},"os":{"type":"Linux","name":"linux","architecture":"x64","version":"4.18.0-372.26.1.el8_6.x86_64"},"platform":"Node.js v14.19.3, LE (unified)|Node.js v14.19.3, LE (unified)"}}}
{"t":{"$date":"2022-09-23T09:18:36.871+02:00"},"s":"I",  "c":"NETWORK",  "id":22944,   "ctx":"conn13896","msg":"Connection ended","attr":{"remote":"10.100.0.101:33260","uuid":"becf7b18-33e7-41d0-b762-161563e1acac","connectionId":13896,"connectionCount":73}}
{"t":{"$date":"2022-09-23T09:18:36.872+02:00"},"s":"I",  "c":"NETWORK",  "id":22944,   "ctx":"conn13895","msg":"Connection ended","attr":{"remote":"10.100.0.101:58386","uuid":"182e808a-d530-4e18-9032-e06a5052de1f","connectionId":13895,"connectionCount":72}}

So it can connect, but fails going forward.

I noticed a typo:

MONGO_URL: mongodb://rc01.example.com:27017,rc02.example.com:27017,rc03.example.com:27017/rocketchat?authSource=admin&replicaSet=rc010&w=majority&directConnection=true

I changed “replicaSet=rc010” to “replicaSet=rc01” and it went a little further. Oddly, I copy&pasted the configuration from the old compose.yml to the new one. The typo is in the old compose file too, but it’s working there and failing in 5.x.

With that change it tried to connect to server 01 and failed because 01 is not the master. Which is correct, 02 currently is. The old docker container then connected to 02; the rocketchat 5.x container fails on the spot.
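A typo like this is easy to miss by eye. Below is a minimal Python sketch, assuming the two URLs are copied verbatim from the compose.yml above and that rc01 (the replSetName in mongod.conf) is the intended set name. The helper names are made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def replica_set_name(uri):
    """Return the replicaSet option of a MongoDB URI, or None if it is absent."""
    values = parse_qs(urlsplit(uri).query).get("replicaSet")
    return values[-1] if values else None


def wrong_replica_set(uris, expected):
    """Return the names of the settings whose replicaSet differs from `expected`."""
    return [name for name, uri in uris.items() if replica_set_name(uri) != expected]


hosts = "rc01.example.com:27017,rc02.example.com:27017,rc03.example.com:27017"
settings = {
    "MONGO_URL": f"mongodb://{hosts}/rocketchat?authSource=admin&replicaSet=rc010&w=majority&directConnection=true",
    "MONGO_OPLOG_URL": f"mongodb://{hosts}/local?authSource=admin&replicaSet=rc01&directConnection=true",
}

print(wrong_replica_set(settings, "rc01"))
```

It only compares the option against the expected name; it does not talk to the cluster.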

By sheer desperation I tried

      MONGO_URL: mongodb://rc02.example.net:27017/rocketchat?replicaSet=rc01&w=majority&directConnection=true
      MONGO_OPLOG_URL: mongodb://rc02.example.net:27017/local?replicaSet=rc01&directConnection=true

Leaving only the current master in there. It went further, but I stopped it before it got to upgrade the schemas. I can’t take a downtime during the day :slight_smile:

Did you change how multiple mongo servers have to be supplied in some way?

@creiss you need to remove directConnection from the connection string (don’t use that when you have multiple hosts). In simple words, that is forcing the driver to connect with “single” topology instead of ReplicaSet.{2,4}Primary; thus the selection error.
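As a rough illustration of that rule, here is a minimal Python sketch that flags a connection string listing several hosts while also setting directConnection=true. The function name is made up, and it only inspects the string; it is not how the driver itself validates options.

```python
from urllib.parse import parse_qs, urlsplit


def direct_connection_conflict(uri):
    """Return True if the URI lists several hosts and sets directConnection=true."""
    parts = urlsplit(uri)
    # netloc may look like "user:pass@host1:27017,host2:27017"; drop credentials.
    hosts = parts.netloc.rsplit("@", 1)[-1].split(",")
    options = parse_qs(parts.query)
    direct = options.get("directConnection", ["false"])[-1].lower() == "true"
    return direct and len(hosts) > 1


hosts = "rc01.example.com:27017,rc02.example.com:27017,rc03.example.com:27017"
with_flag = f"mongodb://{hosts}/rocketchat?authSource=admin&replicaSet=rc01&w=majority&directConnection=true"
without_flag = f"mongodb://{hosts}/rocketchat?authSource=admin&replicaSet=rc01&w=majority"

print(direct_connection_conflict(with_flag))
print(direct_connection_conflict(without_flag))
```

The second string, with the option removed and the replica set name left in place, is the form this advice points to.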


That was it. Thank you very, very, very much.

The new 5.1.2 nodes are up and running with the 5.x mongo cluster. What a great way to start the weekend!
