Mongo is not available, can't start. Waiting 30 seconds and trying again

Description

Hi,
I have a problem related to mongodb. When accessing the rocket via browser, the error appeared: 502 Bad Gateway. So I followed the logs to find a problem when the rocket tries to start mongodb.
With the command “sudo service snap.rocketchat-server.rocketchat-server status” I notice that rocket-mongo does not start the service:
root@ip-172-30-0-43:/var/snap/rocketchat-server/common# sudo service snap.rocketchat-server.rocketchat-server status
● snap.rocketchat-server.rocketchat-server.service - Service for snap application rocketchat-server.rocketchat-server
Loaded: loaded (/etc/systemd/system/snap.rocketchat-server.rocketchat-server.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2020-05-12 21:05:53 -03; 1min 23s ago
Main PID: 7446 (startRocketChat)
Tasks: 2
Memory: 408.0K
CPU: 77ms
CGroup: /system.slice/snap.rocketchat-server.rocketchat-server.service
├─7446 /bin/bash /snap/rocketchat-server/1436/bin/startRocketChat
└─7488 sleep 30

May 12 21:05:53 ip-172-30-0-43 systemd[1]: snap.rocketchat-server.rocketchat-server.service: Service hold-off time over, scheduling restart.
May 12 21:05:53 ip-172-30-0-43 systemd[1]: Stopped Service for snap application rocketchat-server.rocketchat-server.
May 12 21:05:53 ip-172-30-0-43 systemd[1]: Started Service for snap application rocketchat-server.rocketchat-server.
May 12 21:05:53 ip-172-30-0-43 rocketchat-server.rocketchat-server[7446]: Mongo is not available, can’t start. Waiting 10 seconds and trying again
May 12 21:06:03 ip-172-30-0-43 rocketchat-server.rocketchat-server[7446]: Mongo is not available, can’t start. Waiting 15 seconds and trying again
May 12 21:06:18 ip-172-30-0-43 rocketchat-server.rocketchat-server[7446]: Mongo is not available, can’t start. Waiting 20 seconds and trying again
May 12 21:06:38 ip-172-30-0-43 rocketchat-server.rocketchat-server[7446]: Mongo is not available, can’t start. Waiting 25 seconds and trying again
May 12 21:07:03 ip-172-30-0-43 rocketchat-server.rocketchat-server[7446]: Mongo is not available, can’t start. Waiting 30 seconds and trying again

With the command “sudo systemctl status snap.rocketchat-server.rocketchat-mongo.service” I verified that the service has a failure to start:
● snap.rocketchat-server.rocketchat-mongo.service - Service for snap application rocketchat-server.rocketchat-mongo
Loaded: loaded (/etc/systemd/system/snap.rocketchat-server.rocketchat-mongo.service; enabled; vendor preset: enabled)
Active: failed (Result: start-limit-hit) since Tue 2020-05-12 20:24:33 -03; 46min ago
Process: 6096 ExecStart=/usr/bin/snap run rocketchat-server.rocketchat-mongo (code=exited, status=100)
Main PID: 6096 (code=exited, status=100)

May 12 20:24:33 ip-172-30-0-43 systemd[1]: snap.rocketchat-server.rocketchat-mongo.service: Main process exited, code=exited, status=100/n/a
May 12 20:24:33 ip-172-30-0-43 systemd[1]: snap.rocketchat-server.rocketchat-mongo.service: Unit entered failed state.
May 12 20:24:33 ip-172-30-0-43 systemd[1]: snap.rocketchat-server.rocketchat-mongo.service: Failed with result ‘exit-code’.
May 12 20:24:33 ip-172-30-0-43 systemd[1]: snap.rocketchat-server.rocketchat-mongo.service: Service hold-off time over, scheduling restart.
May 12 20:24:33 ip-172-30-0-43 systemd[1]: Stopped Service for snap application rocketchat-server.rocketchat-mongo.
May 12 20:24:33 ip-172-30-0-43 systemd[1]: snap.rocketchat-server.rocketchat-mongo.service: Start request repeated too quickly.
May 12 20:24:33 ip-172-30-0-43 systemd[1]: Failed to start Service for snap application rocketchat-server.rocketchat-mongo.
May 12 20:24:33 ip-172-30-0-43 systemd[1]: snap.rocketchat-server.rocketchat-mongo.service: Unit entered failed state.
May 12 20:24:33 ip-172-30-0-43 systemd[1]: snap.rocketchat-server.rocketchat-mongo.service: Failed with result ‘start-limit-hit’.

Then I used the command “sudo journalctl -u snap.rocketchat-server.rocketchat-mongo” to get more information about the problem:
I CONTROL [initandlisten] MongoDB starting : pid=6118 port=27017 dbpath=/var/snap/rocketchat-server/common 64-bit host=ip-1
I CONTROL [initandlisten] db version v3.4.20
I CONTROL [initandlisten] git version: 447847d93d6e0a21b018d5df45528e815c7c13d8
I CONTROL [initandlisten] allocator: tcmalloc
I CONTROL [initandlisten] modules: none
I CONTROL [initandlisten] build environment:
I CONTROL [initandlisten] distarch: x86_64
I CONTROL [initandlisten] target_arch: x86_64
I CONTROL [initandlisten] options: { net: { bindIp: “127.0.0.1” }, processManagement: { pidFilePath: "/var/snap/rocketchat-
I - [initandlisten] Detected data files in /var/snap/rocketchat-server/common created by the ‘wiredTiger’ storage eng
I STORAGE [initandlisten]
I STORAGE [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
I STORAGE [initandlisten] ** See http://dochub.mongodb.org/core/prodnotes-filesystem
I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=487M,session_max=20000,eviction=(threads_min=4,threads_
E STORAGE [initandlisten] WiredTiger error (-31803) [1589325873:8409][6118:0x7f93934abd00], txn-recover: Recovery failed: W
I - [initandlisten] Assertion: 28595:-31803: WT_NOTFOUND: item not found src/mongo/db/storage/wiredtiger/wiredtiger_k
I STORAGE [initandlisten] exception in initAndListen: 28595 -31803: WT_NOTFOUND: item not found, terminating
I NETWORK [initandlisten] shutdown: going to close listening sockets…
I NETWORK [initandlisten] removing socket file: /tmp/mongodb-27017.sock
I NETWORK [initandlisten] shutdown: going to flush diaglog…
I CONTROL [initandlisten] now exiting
I CONTROL [initandlisten] shutting down with code:100
systemd[1]: snap.rocketchat-server.rocketchat-mongo.service: Main process exited, code=exited, status=100/n/a
systemd[1]: snap.rocketchat-server.rocketchat-mongo.service: Unit entered failed state.
systemd[1]: snap.rocketchat-server.rocketchat-mongo.service: Failed with result ‘exit-code’.
systemd[1]: snap.rocketchat-server.rocketchat-mongo.service: Service hold-off time over, scheduling restart.
systemd[1]: Stopped Service for snap application rocketchat-server.rocketchat-mongo.
systemd[1]: snap.rocketchat-server.rocketchat-mongo.service: Start request repeated too quickly.
systemd[1]: Failed to start Service for snap application rocketchat-server.rocketchat-mongo.
systemd[1]: snap.rocketchat-server.rocketchat-mongo.service: Unit entered failed state.
systemd[1]: snap.rocketchat-server.rocketchat-mongo.service: Failed with result ‘start-limit-hit’.

The error found in the log is this:
STORAGE [initandlisten] WiredTiger error (-31803) [1589325873:8409][6118:0x7f93934abd00], txn-recover: Recovery failed: WT_NOTFOUND: item not found

Did you go through this problem?
I appreciate any help that comes up.

Server Setup Information

  • Version of Rocket.Chat Server:
    Name Version Rev Tracking Publisher Notes
    core 16-2.44.3 9066 latest/stable canonical✓ core
    rocketchat-server 2.4.12 1436 latest/stable rocketchat✓ -

  • Operating System:
    DISTRIB_ID=Ubuntu
    DISTRIB_RELEASE=16.04
    DISTRIB_CODENAME=xenial
    DISTRIB_DESCRIPTION=“Ubuntu 16.04.2 LTS”
    NAME=“Ubuntu”
    VERSION=“16.04.2 LTS (Xenial Xerus)”

  • Deployment Method:

  • Number of Running Instances:
    01

  • DB Replicaset Oplog:

  • NodeJS Version:

  • MongoDB Version:
    db version v3.0.15
    git version: b8ff507269c382bc100fc52f75f48d54cd42ec3b

  • Proxy:

  • Firewalls involved:

Any additional Information

The error started without changing the server, the first error message was observed when accessing the rocket through the browser: 502 Bad Gateway

I had kind of a similar problem yesterday which popped out of the blue. Mongodb failed to start and complained about

WiredTiger error (-31802) [1589317956:154007][4136:0x7f1220932d00], txn-recover: unsupported WiredTiger file version: this build  only supports major/minor versions up to 1/0,  and the file is version 2/0

According to the snap logs (snap changes) this problem came through a snap update yesterday at 21:11 CEST and it vanished by itself with the next snap update at 05:03 CEST.

So I think the first update had a problem and it was fixed with the second update… I’m now searching the Internet for a confirmation of this Bug and why it happend and what has been changed to make it disapear.

Does it help to refresh your snap package?

Thomas

Strange. Before all the problems my mongodb version was:

2020-05-04T07:56:27.069+0200
db version v3.6.14
git version: cbef87692475857c7ee6e764c8f5104b39c342a1

than the snap update came along:

2020-05-12T21:11:42.563+0200
db version v3.4.20
git version: 447847d93d6e0a21b018d5df45528e815c7c13d8

And then the fix came along:

2020-05-13T05:03:17.252+0200
db version v3.6.14
git version: cbef87692475857c7ee6e764c8f5104b39c342a1

Could it really be that the snap update installed an old version of mongodb which -of course- could not startup because it could not work with the files of the newer mongodb version which run before?

Hi Thomas,
The problem started yesterday also here in the rocket, I will try to return the so snap version to analyze the situation.
Thanks for sharing!

Hi Thomas, I was taking too long to solve, so I returned a snapshot of the server. I believe that the automatic update of the rocket has damaged the database. I took advantage and changed it to the stable version.