Description
We migrated our multi-node / replica-set Rocket.Chat deployment to a new Kubernetes cluster. After the migration, chat-room notifications no longer reliably light up or notify on new messages. All client types (iOS, Electron, Web, Android) are affected, but not all rooms/conversations.
Server Setup Information
- Version of Rocket.Chat Server: 1.3.2
- Operating System: k8s
- Deployment Method: Azure AKS
- Number of Running Instances: 3
- DB Replicaset Oplog: yes
- NodeJS Version: v8.11.4
- MongoDB Version: 4.0.12
- Proxy: Nginx
- Firewalls involved: n/m
Any additional Information
The deployment uses Helm for both the database and Rocket.Chat: https://github.com/helm/charts/tree/master/stable/rocketchat
The migration process was: dump the database, restore it at the new site, and bring up the Rocket.Chat replicas.
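For reference, a dump-and-restore migration of this kind typically looks like the following. This is a sketch, not the exact commands used; the replica-set name, hostnames, and backup path are placeholders.

```shell
# Dump the rocketchat database from the old replica set
# (replica-set name "rs0" and hostnames are placeholders)
mongodump --host "rs0/old-mongo-0:27017" --db rocketchat --out /backup

# Restore it into the replica set on the new cluster
mongorestore --host "rs0/new-mongo-0:27017" --db rocketchat /backup/rocketchat
```

After the restore, the Rocket.Chat pods can be pointed at the new replica set and started.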
Post-migration, the only odd errors I saw were:
StreamBroadcast ➔ Stream.error Stream broadcast from '10.61.25.47:3000' to '10.61.17.198:3000/' with name notify-room not authorized
and these have subsided.
In addition, I also found:
Error in oplog callback TypeError: Cannot read property 'u' of undefined
at BaseDb.Subscriptions.on (server/publications/subscription/emitter.js:19:39)
at emitOne (events.js:116:13)
at BaseDb.emit (events.js:211:7)
at BaseDb.processOplogRecord (app/models/server/models/_BaseDb.js:157:9)
at packages/mongo/oplog_tailing.js:105:7
at runWithEnvironment (packages/meteor.js:1356:24)
at Object.callback (packages/meteor.js:1369:14)
at packages/ddp-server/crossbar.js:114:36
at Array.forEach (<anonymous>)
at Function._.each._.forEach (packages/underscore.js:139:11)
at DDPServer._Crossbar.fire (packages/ddp-server/crossbar.js:112:7)
at handleDoc (packages/mongo/oplog_tailing.js:311:24)
at packages/mongo/oplog_tailing.js:337:11
at Meteor.EnvironmentVariable.EVp.withValue (packages/meteor.js:1304:12)
at packages/meteor.js:620:25
at runWithEnvironment (packages/meteor.js:1356:24)
Further troubleshooting tonight involved scaling the Rocket.Chat app replicas down to 0 and then bringing them back up. 2 of the 3 pods logged the following error:
{ MongoError: ns not found
    at /app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb-core/lib/connection/pool.js:581:63
    at authenticateStragglers (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb-core/lib/connection/pool.js:504:16)
    at Connection.messageHandler (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb-core/lib/connection/pool.js:540:5)
    at emitMessageHandler (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb-core/lib/connection/connection.js:310:10)
    at Socket.<anonymous> (/app/bundle/programs/server/npm/node_modules/meteor/npm-mongo/node_modules/mongodb-core/lib/connection/connection.js:453:17)
    at emitOne (events.js:116:13)
    at Socket.emit (events.js:211:7)
    at addChunk (_stream_readable.js:263:12)
    at readableAddChunk (_stream_readable.js:250:11)
    at Socket.Readable.push (_stream_readable.js:208:10)
    at TCP.onread (net.js:597:20)
  operationTime: Timestamp { _bsontype: 'Timestamp', low_: 15, high_: 1569381919 },
  ok: 0,
  errmsg: 'ns not found',
  code: 26,
  codeName: 'NamespaceNotFound',
  '$clusterTime':
   { clusterTime: Timestamp { _bsontype: 'Timestamp', low_: 17, high_: 1569381919 },
     signature: { hash: [Object], keyId: [Object] } },
  name: 'MongoError',
  [Symbol(mongoErrorContextSymbol)]: {} }
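The scale-down/scale-up cycle described above can be sketched with commands along these lines. The deployment name, namespace, and label selector are assumptions for illustration, not taken from the actual cluster.

```shell
# Scale the Rocket.Chat app down to 0 replicas, then back up to 3
# (namespace "rocketchat" and deployment name are placeholders)
kubectl -n rocketchat scale deployment rocketchat --replicas=0
kubectl -n rocketchat scale deployment rocketchat --replicas=3

# Wait for the rollout, then inspect each pod's startup logs for errors
kubectl -n rocketchat rollout status deployment rocketchat
kubectl -n rocketchat logs -l app=rocketchat --tail=100
```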
Based on conversations on open.rocket.chat, I believe this has been resolved by removing w=majority and readPreference from the connection URL.
@aaron.ogle that is correct: removing &readPreference=nearest&w=majority from the MONGO_URL does appear to correct the notification issues we experienced. As a final takeaway, it may be worth noting that the only reason we used those flags at all was this doc: https://rocket.chat/docs/installation/docker-containers/high-availability-install/
It might be worthwhile to update it, or to add a caveat providing some context on when those flags are useful and when they are not.
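Concretely, the fix amounts to stripping those two query parameters from the connection string. The credentials and hostnames below are placeholders; only the replicaSet, readPreference, and w parameters reflect what the HA install doc suggested.

```shell
# Hypothetical MONGO_URL as suggested by the HA install doc (hosts/credentials are examples)
MONGO_URL='mongodb://rocketchat:pass@mongo-0:27017,mongo-1:27017,mongo-2:27017/rocketchat?replicaSet=rs0&readPreference=nearest&w=majority'

# Strip the two options that were implicated in the notification problems
MONGO_URL="$(printf '%s' "$MONGO_URL" | sed -e 's/&readPreference=nearest//' -e 's/&w=majority//')"

echo "$MONGO_URL"
# mongodb://rocketchat:pass@mongo-0:27017,mongo-1:27017,mongo-2:27017/rocketchat?replicaSet=rs0
```

In a Kubernetes deployment this value would be set via the pod's MONGO_URL environment variable (or the Helm chart's values), after which the Rocket.Chat pods need a restart to pick it up.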