Zammad freezing / Dropping Emails under Load

Info:

Zammad Version 5.3.1-1674057058.5f657f26.centos7
OS: CentOS 7
Kernel: 3.10.0-1160.81.1.el7.x86_64
Installed from package.

Hardware:
CPU: 8 cores / 16 threads (HT), Intel Skylake
RAM: 32 GB DDR4
Disk: Cloud SSD, 8250 IOPS / 110 MB/s read/write guaranteed

The system averages about 50% CPU load and 60% RAM usage.

cat /etc/zammad/conf.d/other

export RAILS_LOG_TO_STDOUT=""
export WEB_CONCURRENCY="8"
export RUBY_MALLOC_ARENA_MAX="2"
export RUBY_GC_MALLOC_LIMIT="1077216"
export RUBY_GC_MALLOC_LIMIT_MAX="2177216"
export RUBY_GC_OLDMALLOC_LIMIT="2177216"
export RUBY_GC_OLDMALLOC_LIMIT_MAX="3000100"

cat /etc/elasticsearch/elasticsearch.yml

network.host: 0.0.0.0
discovery.type: single-node
http.max_content_length: 400mb
indices.query.bool.max_clause_count: 2048

Sessions.list.count is normally between 20 and 30.

Expected behavior:

Zammad may run a bit slower under high load, but it should still function correctly and send emails.

Actual behavior:

The Zammad monitoring API check reports unhealthy when the error occurs.
Zammad runs slowly and is nearly unusable under our normal daily load.
Emails are dropped or not sent.

From the production log:

E, [2023-02-23T10:29:30.366324 #697-940260] ERROR -- : thread_client 134980 exited with error #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.014 seconds); all pooled connections were in use>
E, [2023-02-23T10:29:30.374367 #697-940260] ERROR -- : /opt/zammad/vendor/bundle/ruby/3.0.0/gems/activerecord-6.1.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:210:in `block in wait_poll'
  /opt/zammad/vendor/bundle/ruby/3.0.0/gems/activerecord-6.1.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:199:in `loop'
  /opt/zammad/vendor/bundle/ruby/3.0.0/gems/activerecord-6.1.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:199:in `wait_poll'
  /opt/zammad/vendor/bundle/ruby/3.0.0/gems/activerecord-6.1.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:160:in `internal_poll'
  /opt/zammad/vendor/bundle/ruby/3.0.0/gems/activerecord-6.1.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:286:in `internal_poll'
  /opt/zammad/vendor/bundle/ruby/3.0.0/gems/activerecord-6.1.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:155:in `block in poll'
  /opt/zammad/vendor/ruby-3.0.4/lib/ruby/3.0.0/monitor.rb:202:in `synchronize'
  /opt/zammad/vendor/ruby-3.0.4/lib/ruby/3.0.0/monitor.rb:202:in `mon_synchronize'
  /opt/zammad/vendor/bundle/ruby/3.0.0/gems/activerecord-6.1.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:164:in `synchronize'
  /opt/zammad/vendor/bundle/ruby/3.0.0/gems/activerecord-6.1.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:155:in `poll'
  /opt/zammad/vendor/bundle/ruby/3.0.0/gems/activerecord-6.1.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:870:in `acquire_connection'
  /opt/zammad/vendor/bundle/ruby/3.0.0/gems/activerecord-6.1.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:588:in `checkout'
  /opt/zammad/vendor/bundle/ruby/3.0.0/gems/activerecord-6.1.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:428:in `connection'
  /opt/zammad/vendor/bundle/ruby/3.0.0/gems/activerecord-6.1.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:1128:in `retrieve_connection'
  /opt/zammad/vendor/bundle/ruby/3.0.0/gems/activerecord-6.1.7.1/lib/active_record/connection_handling.rb:327:in `retrieve_connection'
  /opt/zammad/vendor/bundle/ruby/3.0.0/gems/activerecord-6.1.7.1/lib/active_record/connection_handling.rb:283:in `connection'
  /opt/zammad/vendor/bundle/ruby/3.0.0/gems/activerecord-6.1.7.1/lib/active_record/core.rb:490:in `cached_find_by_statement'
  /opt/zammad/vendor/bundle/ruby/3.0.0/gems/activerecord-6.1.7.1/lib/active_record/core.rb:386:in `find_by'
  /opt/zammad/app/models/application_model/can_lookup.rb:29:in `lookup'
  /opt/zammad/lib/sessions/client.rb:36:in `block in fetch'
  /opt/zammad/lib/sessions/client.rb:25:in `loop'
  /opt/zammad/lib/sessions/client.rb:25:in `fetch'
  /opt/zammad/lib/sessions/client.rb:9:in `initialize'
  /opt/zammad/lib/sessions.rb:579:in `new'
  /opt/zammad/lib/sessions.rb:579:in `thread_client'
  /opt/zammad/lib/sessions.rb:534:in `block (3 levels) in jobs'
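
To get a feel for how often this pool timeout shows up, a rough sketch like the one below counts the matching log lines per hour. It assumes every log line starts with the timestamp format shown above. The function name is made up for illustration.

```shell
# Count ActiveRecord pool-timeout errors per hour in a Rails-style log.
# Assumes each log line starts like:
#   E, [2023-02-23T10:29:30.366324 #697-940260] ERROR -- : ...
# count_pool_timeouts_per_hour is a hypothetical helper name.
count_pool_timeouts_per_hour() {
  grep 'ActiveRecord::ConnectionTimeoutError' "$1" \
    | sed -E 's/^[A-Z], \[([0-9-]+T[0-9]{2}).*/\1/' \
    | sort \
    | uniq -c
}
```

Usage would be something like `count_pool_timeouts_per_hour /var/log/zammad/production.log`. That path is an assumption on my part, so adjust it to wherever your production log lives. Each output line is a count followed by the date and hour.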

When the error occurs, only restarting Zammad solves the problem.

I’m happy to provide more information. Currently we have to restart Zammad about 3 to 4 times a day to keep it functional.
Sometimes emails are dropped or never sent in the process.

If anyone has an idea what's wrong with our instance, that would be awesome.
Thanks
Moritz

Have you increased max_connections in PostgreSQL? The default is 100, which is not enough for Zammad.
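
To check, something like the following should show the configured limit and how many connections are open right now. This is only a sketch: it assumes PostgreSQL runs on the same host and that the `postgres` system user can connect through `psql` without a password.

```shell
# Show the configured connection limit (PostgreSQL default: 100).
sudo -u postgres psql -c "SHOW max_connections;"

# Count the connections that are currently open, for comparison.
sudo -u postgres psql -c "SELECT count(*) FROM pg_stat_activity;"
```

If the second number sits close to the first during peak hours, raise `max_connections` in `postgresql.conf`. PostgreSQL needs a restart for that change to take effect.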