High CPU making chat unusable

Description

My instance of RC is becoming unusable because it consumes so much CPU. Mongo consumes about 400% CPU on a large server with only a few users online. This started a few weeks ago and is gradually getting worse; currently anonymous login isn’t even working since it can’t load the username suggestions.

I considered it may be due to an attack, but I’ve enabled DDoS protection from Cloudflare and the result is the same. There are, however, no problems on my dev server; the problem starts on production when there are users.

It would be much appreciated if anyone could help sort this out. I have previously restored an old backup of the server since I initially thought it was due to updating to 3.5.0, but a couple of days after restoring the backup the problems arose again.

Server Setup Information

  • Version of Rocket.Chat Server: 3.0.12 and 3.5.0
  • Operating System: Ubuntu
  • Deployment Method: Docker-compose
  • Number of Running Instances: 1
  • MongoDB Version: 3.6
  • Proxy: Nginx

@cchbr Many are running Rocket.Chat on a Raspberry Pi 4 and handling hundreds of users with few reported problems.

What “server” are you using? Please provide some details:

  1. is it a physical box you own? VM? VPS?
  2. what is the size of memory, how many cores, disk space?
  3. if it is in the cloud, which one?
  4. which version of the operating system is it running on?
  5. what else, other than Rocket.Chat, are you running on the “server”?
  1. It’s a VPS.
  2. 6 vCPUs, 16 GB Memory, 320 GB SSD.
  3. It’s a droplet at Digital Ocean.
  4. Ubuntu 18.04.3 LTS
  5. I’m also running a Wildfly server on it but it’s barely used by any users and doesn’t consume a lot of resources.

I thought of starting a completely new instance, in case something has become corrupted in my current one. But I suppose migrating all the data will be troublesome since dumping the whole DB will probably result in the same issues on the new server.

Yes. @cchbr The large amount of VPS memory PLUS the memory-hungry Wildfly PLUS Mongo is likely what is causing your problem.

Please read about how mongo allocates memory here: https://docs.mongodb.com/v4.2/core/wiredtiger/

Your best bet is to isolate mongoDB on its own VPS instance.
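
For illustration only, and only if you are on the default WiredTiger storage engine, the cache can be capped directly from the mongod command in your compose file (the 2 GB figure below is just an example, not a recommendation):

services:
  mongo:
    image: mongo:3.6
    # only applies with the WiredTiger storage engine; caps the cache so mongod
    # does not try to claim roughly half of the host's RAM
    command: mongod --replSet rs0 --oplogSize 128 --wiredTigerCacheSizeGB 2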

Run mongodb in a docker container or use cgroups to limit its memory usage.
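
Something along these lines in the compose file would do it; mem_limit works with any 2.x compose file format, while cpus needs 2.2 or later, and the numbers below are only placeholders:

version: '2.2'

services:
  mongo:
    image: mongo:3.6
    restart: unless-stopped
    # hard memory cap enforced via cgroups
    mem_limit: 4096m
    # CPU cap (compose file format 2.2+): limit mongod to two cores
    cpus: 2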

Thanks so much for your responses.

It’s deployed with docker-compose and I have now tried to limit memory usage with mem_limit: 2048m; I also tried using the cpu_shares limit on the mongo container. But mongod still consumes 200-400% CPU.

I’ll continue to monitor and see if there are any improvements, but I’ll try to isolate mongo on its own instance as well. Are there any specific settings that have to be set to run mongo on a separate server, or will it be enough to run mongo and mongo-init-replica on their own server in docker-compose and point the current rocketchat service to that server?

One thing you can also do is take a look at the logs generated by mongo. It’s likely some queries are slow.
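
If it helps, mongod can also be told to flag slow operations explicitly via flags on the command in the db compose file (the 100 ms threshold is just an example); they will then show up in docker-compose logs mongo:

services:
  mongo:
    image: mongo:3.6
    # --slowms sets the slow-operation threshold in milliseconds;
    # --profile 1 also records slow operations in the system.profile collection
    command: mongod --replSet rs0 --oplogSize 128 --profile 1 --slowms 100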

Also, are you using GridFS and uploading a lot of files? If so, this can add a lot of stress on mongo, as those files are stored inside of mongo.

Yes, I’m using GridFS; there are about 80 GB of files uploaded and approximately 1000 files are uploaded per day, mainly images.

I have set up an isolated server for mongo, but uploads aren’t working with GridFS now; it keeps looking for the images on the app server. I’ve changed to FileSystem, which uploads new files, but the progress meter gets stuck at 0% despite the file being uploaded. And none of the already uploaded files are available, since it still uses the app server URL for those as well.

It’s also very unpredictable right now; a lot of the time it won’t even load a channel, it just gets stuck at loading.

Here’s the docker-compose for the db server:

version: '2'

services:
  mongo:
    image: mongo:3.6
    restart: unless-stopped
    volumes:
     - /mnt/volume_lon1_01/data/runtime/db:/data/db
     - /mnt/volume_lon1_01/data/dump:/dump
    command: mongod --smallfiles --oplogSize 128 --replSet rs0 --storageEngine=mmapv1
    labels:
      - "traefik.enable=true"
    ports:
     - 27017:27017

  # this container's job is just to run the command to initialize the replica set.
  # it will run the command and remove itself (it will not stay running)
  mongo-init-replica:
    image: mongo:3.6
    command: 'bash -c "for i in `seq 1 30`; do mongo mongo/rocketchat --eval \"rs.initiate({ _id: ''rs0'', members: [ { _id: 0, host: ''localhost:27017'' } ]})\" && s=$$? && break || s=$$?; echo \"Tried $$i times. Waiting 5 secs...\"; sleep 5; done; (exit $$s)"'
    depends_on:
      - mongo

And here’s the docker-compose for the app server:

version: '2'

services:
  rocketchat:
    image: rocketchat/rocket.chat:3.0.12
    logging:
        driver: "json-file"
        options:
            max-file: "10"
            max-size: "50m"
    command: bash -c 'for i in `seq 1 30`; do node main.js && s=$$? && break || s=$$?; echo "Tried $$i times. Waiting 5 secs..."; sleep 5; done; (exit $$s)'
    restart: unless-stopped
    volumes:
      - ./uploads:/app/uploads
    environment:
      - PORT=3000
      - ROOT_URL=https://chat.domain.com
      - MONGO_URL=mongodb://64.217.43.48:27017/rocketchat
      - MONGO_OPLOG_URL=mongodb://64.217.43.48:27017/local
      - MAIL_URL=smtp://smtp.email
#       - HTTP_PROXY=http://proxy.domain.com
#       - HTTPS_PROXY=http://proxy.domain.com
    ports:
      - 3000:3000
    labels:
      - "traefik.backend=rocketchat"
      - "traefik.frontend.rule=Host: your.domain.tld"

Usually if it’s getting stuck uploading it can’t write to the folder.
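
Roughly speaking, the bind mount and the FileSystem upload path configured in the admin area have to line up, and the host directory must be writable by the user the container runs as. A minimal sketch, assuming the upload path is set to /app/uploads:

services:
  rocketchat:
    image: rocketchat/rocket.chat:3.0.12
    volumes:
      # host path : container path; the FileSystem upload path set in the admin
      # area has to match the container-side path (/app/uploads here), and the
      # host directory must be writable by the container's user
      - ./uploads:/app/uploads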

I’d recommend moving files to an object store. Minio is a nice easy self hosted one.

There is a community built tool that might be of use here: https://github.com/arminfelder/gridfsmigrate
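
As a rough sketch, Minio can live in the same compose file as Rocket.Chat; the credentials and data path below are placeholders, and the endpoint, bucket and keys are then entered under the file upload settings in the admin area:

services:
  minio:
    image: minio/minio
    command: server /data
    restart: unless-stopped
    volumes:
      - ./minio-data:/data
    environment:
      # placeholder credentials - change them before exposing the service
      - MINIO_ACCESS_KEY=rocketchat
      - MINIO_SECRET_KEY=change-me
    ports:
      - 9000:9000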

Also depending on how many users connect… might be time to scale horizontally and add another rocket.chat instance
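
Roughly what a second instance might look like in the app compose file, with nginx then balancing between ports 3000 and 3001; the db host and INSTANCE_IP values below are placeholders, and if I remember right INSTANCE_IP is what lets multiple instances find each other:

services:
  rocketchat2:
    image: rocketchat/rocket.chat:3.0.12
    restart: unless-stopped
    environment:
      - PORT=3001
      - ROOT_URL=https://chat.domain.com
      # placeholder address of the separate mongo server
      - MONGO_URL=mongodb://DB_HOST_IP:27017/rocketchat
      - MONGO_OPLOG_URL=mongodb://DB_HOST_IP:27017/local
      # lets the instances register themselves so events reach users on either one
      - INSTANCE_IP=APP_HOST_PRIVATE_IP
    ports:
      - 3001:3001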

New issues are just popping up each day with this software. When I activate FileSystem uploads it doesn’t store them in the correct folder, but in a path such as /var/lib/docker/overlay2/68a9f50e946d6f986b0cf19662a036eb1b1be74f196aa99fbb3aaafc25c96c6c/merged/app/bundle/programs/server/uploads. And as soon as I restart the container it breaks and all images return 404, even the avatars, probably because the container id changes on restart. Even if I run all containers as root and set the path for uploads to /uploads, it won’t save anything to that folder, only to the folder relative to Docker.

Usually about 70-100 users are connecting simultaneously, so you’d think that 6 vCPUs and 16 GB memory would be enough, but more users are leaving each day since it’s so unstable.