We are considering using Rocket.Chat (deployed on a standalone server) for our company and we have questions regarding performance:
- Is Rocket.Chat suitable for 200,000-300,000 users (with several thousand simultaneously active users)?
- If yes, what server configuration do we need to handle this amount of users?
The most expensive Rocket.Chat Cloud tier (https://rocket.chat/cloud) supports only 50,000 users.
That’s why we are worried about whether our dedicated server will be able to handle 300,000 users.
Thanks for posting the question in a place we can actually write a bit of a response.
First off, our cloud plans should absolutely not be taken as a sign of any limitations; they are simply the tiers we decided to offer based on our research.
Another thing to make clear: the real factor in planning your server is the number of connected users. Those are the users that will actually be adding load to the system. The total number of users is typically not a factor. Take our community server, open.rocket.chat, for example: it has 200k registered users.
Have you taken a look at our scaling document? https://rocket.chat/docs/installation/manual-installation/multiple-instances-to-improve-performance
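For reference, the multi-instance setup described there boils down to several Rocket.Chat processes sharing one MongoDB replica set behind a sticky load balancer. A rough nginx sketch (ports, upstream name, and paths are examples, not taken from the doc):

```nginx
# Example only: two Rocket.Chat instances on one host behind nginx.
upstream rocketchat {
    ip_hash;                  # keep a client's websocket on the same instance
    server 127.0.0.1:3001;
    server 127.0.0.1:3002;
}

server {
    listen 80;
    location / {
        proxy_pass http://rocketchat;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;   # websocket upgrade
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}
```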
Aaron, thanks for your reply.
What are the limitations for the number of connected users?
We’ve set up an instance of Rocket.Chat, with a replica set and oplog tailing.
We’ve also configured mongodb logrotate to avoid extremely large log files.
First, we created a small number of users and checked the login time (via the websocket API).
The “login” method took several (4-5) seconds to execute.
Then, we added many more users (150,000+) and checked the login time again.
After that, the “login” method took 20-30 seconds to execute.
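For context, a login call over the websocket (DDP) API can be timed with a sketch like the following. The message shapes follow Rocket.Chat’s realtime API (the password is sent as a sha-256 digest); the helper names and the timing wrapper are illustrative, not the tool actually used here:

```python
import hashlib
import json
import time

def build_connect() -> str:
    # DDP session handshake message sent right after the websocket opens
    return json.dumps({"msg": "connect", "version": "1", "support": ["1"]})

def build_login_call(username: str, password: str, call_id: str = "1") -> str:
    # Rocket.Chat's realtime API expects the password as a sha-256 hex digest
    digest = hashlib.sha256(password.encode("utf-8")).hexdigest()
    return json.dumps({
        "msg": "method",
        "method": "login",
        "id": call_id,
        "params": [{
            "user": {"username": username},
            "password": {"digest": digest, "algorithm": "sha-256"},
        }],
    })

def timed(fn):
    # wall-clock timing for a single call, e.g. send login + wait for result
    start = time.monotonic()
    result = fn()
    return result, time.monotonic() - start
```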
Moreover, we didn’t have any significant number of connections to our Rocket.Chat server at that moment, so I assume that running several Rocket.Chat instances won’t help us in this case.
That’s why we are worried about Rocket.Chat performance correlation with the overall number of users.
The number of connected users is purely dependent on configuration. We don’t have a known limit; we haven’t hit one.
These query times sound like something is wrong in your configuration.
What does the hardware look like underneath? I’ve also seen performance like this on cheap VPSes.
Also if you run mongo and Rocket.Chat on the same machine they do fight for resources.
Also, if you plan to scale, make sure mongo is running mmapv1 for the best performance between it and Rocket.Chat. It is better suited to Rocket.Chat’s query patterns.
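For anyone following along, selecting the storage engine is a mongod.conf setting. A sketch of the relevant fragment (paths and the replica set name are examples; note that switching engines on an existing deployment requires a dump/restore, and mmapv1 only exists up to MongoDB 4.0 — it was removed in 4.2):

```yaml
# mongod.conf (fragment) -- example values, MongoDB 3.x/4.0 only
storage:
  engine: mmapv1
  dbPath: /var/lib/mongodb
replication:
  replSetName: rs01        # needed for the oplog that Rocket.Chat tails
systemLog:
  logRotate: reopen        # pair with an external logrotate rule
```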
Aaron, thanks for your answer.
We’ve made a couple of changes to improve our performance.
Maybe it will be useful for other people struggling with performance issues:
- We’ve upgraded our VPS.
- We’ve moved mongo to a different machine.
- We’ve changed the file system on disk with mongo to XFS.
- We didn’t change the mongo storage engine (it’s still WiredTiger).
- We’ve re-enabled token expiration (because users with more than 1000 tokens tend to slow down login greatly for every user).
- We’ve turned on removal of expired tokens.
- We’ve added another auth option (an auth key) for our custom integrations (previously we logged in as admin to call some APIs), so that we don’t need to create more auth tokens.
- We’ve made other optimizations around user creation to suit our needs (for instance, we don’t need an email), so that we make fewer queries to mongo.
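The token problem above can be checked with a small maintenance query. This is a sketch assuming Rocket.Chat’s Meteor user schema, where resume tokens live in the `services.resume.loginTokens` array of the `users` collection; the function names and threshold are ours:

```python
def token_count_pipeline(threshold=1000):
    # MongoDB aggregation: count login tokens per user, keep the heavy ones.
    # Run against the `users` collection, e.g. db.users.aggregate(pipeline).
    return [
        {"$project": {
            "username": 1,
            "tokenCount": {"$size": {"$ifNull": ["$services.resume.loginTokens", []]}},
        }},
        {"$match": {"tokenCount": {"$gte": threshold}}},
        {"$sort": {"tokenCount": -1}},
    ]

def login_token_count(user_doc):
    # Number of resume tokens stored on a single user document
    return len(user_doc.get("services", {}).get("resume", {}).get("loginTokens", []))

def heavy_token_users(user_docs, threshold=1000):
    # Same check done client-side on already-fetched documents
    return [(u.get("username"), login_token_count(u))
            for u in user_docs if login_token_count(u) >= threshold]
```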
Right now, we are working on a system to simulate real chat usage (thousands of users logging in and writing messages to each other).
Once we are done, I think I can post our results regarding the number of simultaneously connected users that we can handle.
Awesome! Thanks for the detailed answer. What are you using to test? We are working on a few things to test scale and would be very interested to hear your approach and or tooling to test.
I don’t know all the details as this tool is developed by another team.
But here are some general ideas:
- We have a worker node (a simple .jar file) and a separate service with web ui to control nodes.
- Each node connects to the main service and listens for commands.
- Each node can start several processes (based on the thread limit) to simulate chat usage by a real person (the most common cases, like login/logout, creating new subscriptions/rules, and writing to other people).
- Each node sends statistics and logs to the main service.
Using this approach we are able to deploy any number of nodes in different places while having a single place to manage the simulation and collect all statistics.
If we ran the simulation from only one server, it could turn out that we had reached its throughput limit, and our results would be invalid.
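The worker side described above could be sketched like this. This is an illustrative Python stand-in for the actual .jar worker: one thread plays one simulated user running a login/chat/logout loop and reports events to a shared queue (standing in for the “send statistics to the main service” step). The action names and rates are placeholders, not the real tool’s:

```python
import queue
import random
import threading
import time

# Illustrative actions a simulated user might perform between login and logout
ACTIONS = ["send_message", "create_subscription", "read_channel"]

def simulate_user(user_id, n_actions, stats):
    # One thread = one simulated user; every event is timestamped
    stats.put((user_id, "login", time.monotonic()))
    for _ in range(n_actions):
        stats.put((user_id, random.choice(ACTIONS), time.monotonic()))
    stats.put((user_id, "logout", time.monotonic()))

def run_node(n_users=10, n_actions=5):
    # One worker node: spawn a thread per user, gather all events at the end
    stats = queue.Queue()
    threads = [threading.Thread(target=simulate_user, args=(i, n_actions, stats))
               for i in range(n_users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    events = []
    while not stats.empty():
        events.append(stats.get())
    return events
```

In the real tool these events would be streamed to the control service instead of collected locally, and the actions would be actual websocket calls against the Rocket.Chat instance.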
We are also planning to run Rocket.Chat for the same number of users (200,000-300,000, with several thousand simultaneously active). Please help us choose the best server configuration (CPU? RAM? storage? a scalable server architecture?). Also, please share your performance-test tools and results.
Hi @aaron.ogle and @nikita.fidirko
Did you come up with a benchmarking solution in the end?
Is Rocket.Chat able to handle the stated number of users, and which deployment configuration is recommended for such a large number of users?
What tech stack are you using to simulate real chat usage? What are your results, and can you share them?
We are also planning to run Rocket.Chat for about 250,000 active users. Did you solve the problem successfully?