MongoDB killed by OOM killer because of GridFS used in production - keeping rocket.chat 6.13.1 with mongodb 4.4.25 alive

Description

I took over administration of a small self-hosted server running Rocket.Chat with MongoDB 4 in 2022 and have kept it running ever since, with the plan to move our community to a hosted (not self-hosted) instance in April 2025. I would like to keep the EOL instance running until then. It worked well until two days ago. Can you please help me troubleshoot this problem?

For the past two days (since Sunday 19.1., 21:00 GMT), MongoDB keeps going down because it “has been killed by the OOM killer” again and again.
As a routine step, I updated to the newest version of Rocket.Chat that is compatible with the server (i.e. 6.13.1, as the server can only run MongoDB 4.4 because its CPU lacks the AVX flag). Before that it ran 6.13.0. The update did not help.
When starting the Rocket.Chat server after updating, I got some “No real time data received recently” messages similar to THIS. In the Rocket.Chat logs I found the error “Some indexes for collection ‘rocketchat_uploads’ could not be created”, so I deleted the indexes in MongoDB using db.rocketchat_uploads.dropIndexes();, which let it start up and made real-time chatting possible. But now I have the following problem:

When I now restart MongoDB and Rocket.Chat, they run for a while, but then MongoDB goes down and Rocket.Chat stops working with it.

I have 4GB RAM on that machine, so I tried limiting the memory for MongoDB to 1GB (with sudo systemctl edit mongod and setting [Service] MemoryLimit=1024M), but it still gets killed by the OOM killer after Rocket.Chat has been running for some time (does it run into the MemoryLimit or a time limit? It dies with `systemd[1]: mongod.service: A process of this unit has been killed by the OOM killer.`). I tried running MongoDB without Rocket.Chat and it consumes around 250MB without crashing. Only when I start Rocket.Chat does it stop after a while, emitting some “Slow query” entries with a duration of over 1s in the MongoDB logs.
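
A rough worked example of one possible reason the 1GB limit does not help: MongoDB's documentation gives the default WiredTiger internal cache size as the larger of 50% of (RAM - 1 GB) or 256 MB. The sketch below assumes that mongod sizes its cache from the machine's full 4GB rather than from the systemd limit, which has not been verified on this server. The function name is made up for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def default_wiredtiger_cache_bytes(total_ram_bytes: int) -> int:
    """Documented default: the larger of 50% of (RAM - 1 GiB) or 256 MiB."""
    return max((total_ram_bytes - GIB) // 2, 256 * MIB)


# Assumption: mongod sees the full 4 GiB of the host, not the systemd limit.
ram = 4 * GIB
limit = 1024 * MIB  # MemoryLimit=1024M from the systemd override

cache = default_wiredtiger_cache_bytes(ram)
print(f"default cache: {cache / GIB:.2f} GiB")
print(f"systemd limit: {limit / GIB:.2f} GiB")
print("cache alone can exceed the limit:", cache > limit)
```

If that assumption holds, the cache by itself may grow past the limit once Rocket.Chat starts reading data, and the OOM kill would be a memory limit rather than a time limit. The cache can be capped with storage.wiredTiger.engineConfig.cacheSizeGB in mongod.conf; which value suits this server is a judgment call.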

Do you have any idea what is going on here and how to troubleshoot it? Is there some way to improve the performance of MongoDB before starting Rocket.Chat?

I would like to keep the server running. It would be acceptable to disable file uploads, or even to get rid of all uploaded files (if there is a way to download all uploaded files before that – they are already backed up with a mongodump). Please help :).

Server Setup Information

  • Version of Rocket.Chat Server: 6.13.1
  • Operating System: Debian GNU/Linux
  • Deployment Method: tar
  • Number of Running Instances: 1
  • DB Replicaset Oplog: Enabled
  • NodeJS Version: 14.21.3 - x64
  • MongoDB Version: 4.4.25
  • Proxy: nginx
  • Firewalls involved: no

Any additional Information

A “Slow query” log message from /var/log/mongodb/mongod.log, written before MongoDB went down:

{"t":{"$date":"2025-01-21T09:30:01.958+01:00"},"s":"I",  "c":"COMMAND",  "id":51803,   "ctx":"conn15",
"msg":"Slow query","attr":
{"type":"command","ns":"b.rocketchat_uploads.chunks","command":
{"find":"rocketchat_uploads.chunks","filter":{"files_id":"6787e7aff661709357965623"},"sort":
{"n":1},"limit":0,"lsid":{"id":{"$uuid":"6e1a5cb1-91b8-4796-aaea-75b9819211b5"}},"$clusterTime":
{"clusterTime":{"$timestamp":{"t":1737448190,"i":10}},"signature":{"hash":{"$binary":
{"base64":"AAAAAAAAAAAAAAAAAAAAAAAAAAA=","subType":"0"}},"keyId":0}},"$db":"b"},
"planSummary":"COLLSCAN","keysExamined":0,"docsExamined":3816,"hasSortStage":true,
"cursorExhausted":true,"numYields":278,"nreturned":1,"queryHash":"9FE7AF15",
"planCacheKey":"9FE7AF15","reslen":152575,"locks":{"FeatureCompatibilityVersion":
{"acquireCount":{"r":279}},"ReplicationStateTransition":{"acquireCount":{"w":279}},"Global":
{"acquireCount":{"r":279}},"Database":{"acquireCount":{"r":279}},"Collection":{"acquireCount":
{"r":279}},"Mutex":{"acquireCount":{"r":1}}},"storage":{"data":
{"bytesRead":878971280,"timeReadingMicros":5332465}},"protocol":"op_msg",
"durationMillis":5436}}

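To make the log entry above easier to read, here is a minimal sketch that pulls out the fields that matter for memory pressure. LOG_LINE is a trimmed copy of the entry (only fields that appear in the original), and summarize_slow_query is a hypothetical helper name, not part of any MongoDB tooling.

```python
import json

# Trimmed copy of the slow-query entry above; values are taken from the log.
LOG_LINE = (
    '{"t":{"$date":"2025-01-21T09:30:01.958+01:00"},"s":"I","c":"COMMAND",'
    '"id":51803,"ctx":"conn15","msg":"Slow query","attr":{'
    '"ns":"b.rocketchat_uploads.chunks","planSummary":"COLLSCAN",'
    '"keysExamined":0,"docsExamined":3816,"nreturned":1,'
    '"storage":{"data":{"bytesRead":878971280,"timeReadingMicros":5332465}},'
    '"durationMillis":5436}}'
)


def summarize_slow_query(line: str) -> dict:
    """Extract plan, scan size and read volume from one JSON log line."""
    attr = json.loads(line)["attr"]
    data = attr.get("storage", {}).get("data", {})
    return {
        "ns": attr["ns"],
        "plan": attr.get("planSummary"),
        "full_scan": attr.get("planSummary") == "COLLSCAN",
        "docs_examined": attr.get("docsExamined", 0),
        "returned": attr.get("nreturned", 0),
        "mib_read": data.get("bytesRead", 0) / 2**20,
        "seconds": attr.get("durationMillis", 0) / 1000,
    }


summary = summarize_slow_query(LOG_LINE)
for key, value in summary.items():
    print(f"{key}: {value}")
```

For this entry the summary shows a full collection scan (COLLSCAN, no index keys examined) that read roughly 838 MiB from disk to return a single result. GridFS drivers normally create a unique index on { files_id: 1, n: 1 } for the chunks collection; whether that index is missing here would need to be checked on the server, for example with db.getCollection("rocketchat_uploads.chunks").getIndexes() in the mongo shell.
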
How many users?

What type of file storage?

License: Community
Seats: 77 seats / users
File Storage: GridFS

As expected.

Fix that - GridFS should NOT be used in Production. And the lesson is: always read ALL the documentation, as most of your answers are already there.

https://docs.rocket.chat/docs/recommendations-for-file-upload

By default, GridFS is used in Rocket.Chat for file storage because MongoDB offers this functionality with zero configuration. However, it is not recommended for production environments due to the high load it places on the database.

There are two migrator repos.

Note also: though your Mongo is ‘permitted’ by Rocket.Chat, it is EOL and NOT supported by anyone. You should move ASAP, preferably to Mongo 6+, as Mongo 5 is EOL soon.

Repos also noted here:

Okay, that’s embarrassing. Thanks a lot for your pointers :slight_smile: Will follow them and report afterwards.

:rofl::rofl::rofl:

Happens to the best of us.

Thanks, the “RocketChat GridFS to filesytem migration script” did the job. Now Rocket.Chat is alive again! I also read your FAQ and will come back to it before posting a new question here :).

Yay!!

Glad you got it sorted.