Infos:
- Used Zammad version: 6.3.0
- Used Zammad installation type: docker-compose
- Operating system: Ubuntu 22.04.4 LTS (jammy)
- Browser + version: Chrome 124.0.6367.201
Expected behavior:
Scheduler able to finish tasks
Actual behavior:
Some scheduled tasks never finish (mostly Channel.stream, Sessions.jobs, but also others), showing errors in the zammad-scheduler container.
Yet some tasks seem to complete just fine. (The LDAP sync and Office 365 integrations do seem to update).
Monitor tab shows: Scheduler may not run (last execution of Chat.cleanup 5 days ago) - please contact your system administrator
Most descriptive error in logs is: ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use
.
Steps to reproduce the behavior:
docker-compose stop && docker-compose up -d
What I’ve tried to troubleshoot
- Create copy of
/opt/zammad/config/database/database.yml
, increase pool size to 500 and mount in all zammad containers (via shared-service yaml block) - Set environments vars
ZAMMAD_MAX_THREADS
,MAX_THREADS
,RAILS_MAX_THREADS
to 100 in docker-compose.yml - Change postgresql.conf
max_connections
parameter to value of500
Full scheduler log:
I, [2024-05-14T16:08:01.944727#1-149060] INFO -- : ProcessScheduledJobs running...
I, [2024-05-14T16:08:01.946071#1-149060] INFO -- : Running job thread for 'Process pending tickets.' (Ticket.process_pending) status is: sleep
I, [2024-05-14T16:08:01.946116#1-149060] INFO -- : Running job thread for 'Check 'Channel' streams.' (Channel.stream) status is: sleep
I, [2024-05-14T16:08:01.946133#1-149060] INFO -- : Running job thread for 'Execute import jobs.' (ImportJob.start_registered) status is: sleep
I, [2024-05-14T16:08:01.946153#1-149060] INFO -- : Running job thread for 'Generate 'Session' data.' (Sessions.jobs) status is: sleep
I, [2024-05-14T16:08:01.946169#1-149060] INFO -- : Running job thread for 'Update exchange oauth 2 token.' (ExternalCredential::Exchange.refresh_token) status is: sleep
I, [2024-05-14T16:08:01.946192#1-149060] INFO -- : Running job thread for 'Check channels.' (Channel.fetch) status is: sleep
I, [2024-05-14T16:08:01.946209#1-149060] INFO -- : Running job thread for 'Process automatic ticket unassignments.' (Ticket.process_auto_unassign) status is: sleep
I, [2024-05-14T16:08:01.946223#1-149060] INFO -- : Running job thread for 'Process ticket escalations.' (Ticket.process_escalation) status is: sleep
I, [2024-05-14T16:08:01.946240#1-149060] INFO -- : Running job thread for 'Handle data privacy tasks.' (DataPrivacyTaskJob.perform_now) status is: sleep
I, [2024-05-14T16:08:01.947104#1-149060] INFO -- : Running job thread for 'Clean up dead sessions.' (SessionTimeoutJob.perform_now) status is: sleep
I, [2024-05-14T16:08:01.947168#1-149060] INFO -- : Running job thread for 'Delete old online notification entries.' (OnlineNotification.cleanup) status is: sleep
I, [2024-05-14T16:08:01.947228#1-149060] INFO -- : Running job thread for 'Generate user-based stats.' (Stats.generate) status is: sleep
I, [2024-05-14T16:08:01.947540#1-149060] INFO -- : Running job thread for 'Close chat sessions where participants are offline.' (Chat.cleanup_close) status is: sleep
I, [2024-05-14T16:08:01.947646#1-149060] INFO -- : Running job thread for 'Execute planned jobs.' (Job.run) status is: sleep
E, [2024-05-14T16:08:02.918660#1-159100] ERROR -- : execute Ticket.process_auto_unassign (try_count 8) exited with error #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use> in: 5.000310106 seconds.
E, [2024-05-14T16:08:02.944379#1-159200] ERROR -- : execute SessionTimeoutJob.perform_now (try_count 1) exited with error #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use> in: 5.000251807 seconds.
E, [2024-05-14T16:08:03.944897#1-159220] ERROR -- : execute OnlineNotification.cleanup (try_count 1) exited with error #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use> in: 5.000280882 seconds.
E, [2024-05-14T16:08:04.943091#1-159140] ERROR -- : Can't reconnect to database #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use>
E, [2024-05-14T16:08:04.944778#1-159240] ERROR -- : execute Stats.generate (try_count 1) exited with error #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use> in: 5.00021298 seconds.
E, [2024-05-14T16:08:05.907592#1-159080] ERROR -- : Can't reconnect to database #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use>
E, [2024-05-14T16:08:05.907744#1-159080] ERROR -- : Failed to run ImportJob.start_registered after 11 tries.
E, [2024-05-14T16:08:05.940307#1-159120] ERROR -- : Can't reconnect to database #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use>
E, [2024-05-14T16:08:05.943204#1-159160] ERROR -- : Can't reconnect to database #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use>
E, [2024-05-14T16:08:05.945166#1-159260] ERROR -- : execute Chat.cleanup_close (try_count 1) exited with error #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use> in: 5.000192878 seconds.
E, [2024-05-14T16:08:06.943661#1-159180] ERROR -- : Can't reconnect to database #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use>
E, [2024-05-14T16:08:06.945652#1-159280] ERROR -- : execute Job.run (try_count 1) exited with error #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use> in: 5.000225233 seconds.
E, [2024-05-14T16:08:07.918939#1-159100] ERROR -- : Can't reconnect to database #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use>
E, [2024-05-14T16:08:07.944594#1-159200] ERROR -- : Can't reconnect to database #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use>
E, [2024-05-14T16:08:08.945150#1-159220] ERROR -- : Can't reconnect to database #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use>
E, [2024-05-14T16:08:09.945021#1-159240] ERROR -- : Can't reconnect to database #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use>
E, [2024-05-14T16:08:10.908190#1-159080] ERROR -- : could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use (ActiveRecord::ConnectionTimeoutError)
lib/background_services/service/process_scheduled_jobs/job_executor.rb:95:in `retry_limit_reached'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:82:in `retry_execution'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:34:in `rescue in execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:24:in `execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:88:in `retry_execution'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:34:in `rescue in execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:24:in `execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:88:in `retry_execution'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:34:in `rescue in execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:24:in `execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:88:in `retry_execution'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:34:in `rescue in execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:24:in `execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:88:in `retry_execution'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:34:in `rescue in execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:24:in `execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:88:in `retry_execution'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:34:in `rescue in execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:24:in `execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:88:in `retry_execution'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:34:in `rescue in execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:24:in `execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:88:in `retry_execution'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:34:in `rescue in execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:24:in `execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:88:in `retry_execution'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:34:in `rescue in execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:24:in `execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:88:in `retry_execution'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:34:in `rescue in execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:24:in `execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:88:in `retry_execution'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:34:in `rescue in execute'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:24:in `execute'
lib/background_services/service/process_scheduled_jobs/job_executor/one_time.rb:6:in `run'
lib/background_services/service/process_scheduled_jobs/job_executor.rb:10:in `run'
lib/background_services/service/process_scheduled_jobs/manager.rb:84:in `block in start_in_thread'
lib/application_handle_info.rb:19:in `use'
lib/background_services/service/process_scheduled_jobs/manager.rb:82:in `start_in_thread'
lib/background_services/service/process_scheduled_jobs/manager.rb:73:in `block (2 levels) in start'
lib/background_services/service/process_scheduled_jobs/manager.rb:72:in `block in start'
E, [2024-05-14T16:08:10.945404#1-159260] ERROR -- : Can't reconnect to database #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use>
E, [2024-05-14T16:08:11.943715#1-159140] ERROR -- : execute ExternalCredential::Exchange.refresh_token (try_count 2) exited with error #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use> in: 5.000303536 seconds.
E, [2024-05-14T16:08:11.945841#1-159280] ERROR -- : Can't reconnect to database #<ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use>