Zammad outage spikes

Gogsy · February 9, 2026, 1:02pm

Used Zammad version: Zammad version 6.5.2-1769767752.d278bdb0.noble
psql (PostgreSQL) 16.11
elasticsearch 8.19.10
Used Zammad installation type: package
Operating system:
Server 1 (Zammad app): OS: Ubuntu 24.04.3 LTS
Server 2 (Postgres DB): OS: Ubuntu 24.04.3 LTS
Server 3 (ElasticSearch): OS: Ubuntu 24.04.3 LTS
Browser + version: [Brave 1.86.148 (Official Build) (64-bit)]

Hello,

I’m experiencing intermittent spikes in Zabbix monitoring related to Zammad Web UI availability.

I configured a Web Scenario in Zabbix to monitor the Zammad web interface and alert when the site becomes unavailable. Periodically, Zabbix reports “Zammad Web UI is down”. At the exact moment of the alert, the web interface noticeably freezes or lags for a few seconds and then immediately recovers, continuing to operate normally.

What makes this difficult to diagnose:

No errors in Zammad logs
No Nginx or application errors
No CPU, RAM, or I/O spikes
No network drops or connectivity issues
No visible service restarts
Zabbix server itself shows no resource spikes

The behavior looks like a short application stall (2–5 seconds), not a full outage.
The frequency varies: sometimes once per day, sometimes every 30 minutes.

Has anyone encountered similar short “micro-outages” or brief UI stalls without corresponding log errors?
What would be the best way to instrument or test this further to identify the root cause (request timing, upstream latency, Puma workers, DB waits, etc.)?
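One way to instrument the request-timing part is a small external probe that records how long each request takes, so a slow-but-successful response can be told apart from a real failure. The following is only a minimal sketch: the function names, the 1-second polling interval, and the 2-second stall threshold (taken from the 2–5 second stalls described above) are assumptions, not anything Zammad or Zabbix provides.

```python
import http.client
import time
import urllib.error
import urllib.request


def probe(url, timeout=10.0):
    """Send one GET request and return (ok, seconds).

    ok is False on any connection error, timeout, or HTTP error status.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
            ok = 200 <= resp.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        ok = False
    return ok, time.monotonic() - start


def classify(ok, seconds, stall_after=2.0):
    """Label one sample: 'down' if it failed, 'stall' if slow, else 'ok'."""
    if not ok:
        return "down"
    return "stall" if seconds >= stall_after else "ok"


def monitor(url, interval=1.0, count=60):
    """Probe the URL `count` times and print one timestamped line per sample."""
    for _ in range(count):
        ok, seconds = probe(url)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(stamp, classify(ok, seconds), f"{seconds:.3f}s")
        time.sleep(interval)
```

Calling `monitor("https://zammad.example.com/")` (a placeholder URL) from a second host would give timestamps that can be lined up against the Zabbix alerts and the Nginx access log, which should show whether the delay sits in front of or behind the web server.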

MrGeneration · February 9, 2026, 6:30pm

With that heavy separation, I’d assume you have quite a lot of users.
So you might want to share your number of concurrent users and your performance tuning settings as well.

Because this is extremely relevant for what anyone could answer here.

Gogsy · February 9, 2026, 6:50pm

I would say approx. 20 agents.
Customers: around 200, but maybe only 20 use the portal to open tickets. The rest use email and never go on the portal.

We are looking to expand to about 45 agents later, but I want to solve this before that if possible.

Gogsy · February 9, 2026, 6:52pm

Also, we would have a lot of customers when we expand to 45 agents. So far we are internal only, before we move on to outside customers.

MrGeneration · February 9, 2026, 7:01pm

So you have zero performance tunings active?

Gogsy · February 9, 2026, 8:01pm

Yes, we have not tuned anything.
Mainly because the system is running stably, with no spikes in usage, nothing.
On the app server we have a stable 16% CPU usage and 8% RAM.
DB server: 2% CPU, 3% RAM.
Elasticsearch: 63% RAM, 7% CPU.

Unless you think I should do some tuning?

MrGeneration · February 9, 2026, 8:05pm

Well yes, no wonder the web interface begins to be slow and unresponsive when people are using it.

You can learn more in the documentation

or… if you prefer video content, I am explaining that stuff in this video.

Gogsy · February 9, 2026, 8:18pm

Thank you for this. I will set it up and get back to you with feedback.

Gogsy · February 10, 2026, 10:42am

I did some performance tuning.
It has barely touched the server load. We have planned for an extreme scale, so we will have to keep an eye on this.

I will keep you informed on whether this has fixed the issue.

MrGeneration · February 10, 2026, 3:14pm

Yes, tinkering around will take some iterations if you ramp up the traffic at some point. But it’s definitely manageable.

Gogsy · February 23, 2026, 8:39am

Hi, I managed to get it working now.
Here and there we still get a small spike, but not 20 times a day as before, so performance will still need some tinkering.
Thank you for this.

system · March 2, 2026, 8:40am

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.