Script/scheduler.rb not creating PID file

Infos:

  • Used Zammad version: 3.0.0
  • Used Zammad installation source: source
  • Operating system: Ubuntu 18.04
  • Browser + version: n/a

Expected behavior:

We are running two Zammad instances and want only one scheduler to run at a time: when both run with bad timing, one of them tries to process an already-processed e-mail and fails, generating an unprocessed e-mail and a Postgres error because the same primary key already exists. So far so good.

Judging from the daemons configuration with multiple: false, I thought it would be easy: if I just symlink the tmp/pids directory to a GlusterFS mount and server2 is first to start the scheduler, server1 would notice the already existing PID file and not start a scheduler of its own.

So I would expect the following:

  • Starting script/scheduler.rb should create a PID file called scheduler.pid in /var/www/zammad/current/tmp/pids.

I assume this from the configuration of the daemons gem in script/scheduler.rb:

daemon_options = {
  multiple:  false,                          # allow only one running instance
  dir_mode:  :normal,                        # use :dir as a literal path for PID files
  dir:       File.join(dir, 'tmp', 'pids'),  # scheduler.pid should land here
  backtrace: true                            # write a backtrace on crashes
}
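
For context, the daemons gem applies options like these through Daemons.run_proc, which (when daemonizing) forks and writes an <app_name>.pid file into the configured :dir. Here is a minimal, self-contained sketch of that usage; it is not the actual Zammad code, and the loop body is a placeholder:

require 'daemons'

dir = '/var/www/zammad/current' # assumed install path, as above

daemon_options = {
  multiple:  false,
  dir_mode:  :normal,
  dir:       File.join(dir, 'tmp', 'pids'),
  backtrace: true
}

# When started daemonized (e.g. "ruby scheduler.rb start"), this should
# create tmp/pids/scheduler.pid and refuse a second concurrent start.
Daemons.run_proc('scheduler', daemon_options) do
  loop do
    # placeholder for the real background-job dispatching
    sleep 10
  end
end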

Actual behavior:

  • There is no PID file with the PID of the scheduler.

Steps to reproduce the behavior:

  • Install Zammad
  • Run bundle exec script/scheduler.rb start -t (I do not know what the -t is for)

Workaround

I can think of adding a primitive locking mechanism to script/scheduler.rb (see the sketch below), but maybe this is something that could be adopted (or, in case this is a bug, fixed) upstream as well.
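
To make that concrete, here is a minimal sketch of such a guard, assuming the lock file lives on storage both nodes can see; scheduler.lock and the surrounding names are made up for this example, and flock semantics over network filesystems such as GlusterFS are not always reliable:

dir = '/var/www/zammad/current' # assumed install path

# Hypothetical guard near the top of script/scheduler.rb: take an
# exclusive, non-blocking lock; bail out if another process holds it.
lock_path = File.join(dir, 'tmp', 'pids', 'scheduler.lock')
lock_file = File.open(lock_path, File::RDWR | File::CREAT, 0o644)
unless lock_file.flock(File::LOCK_EX | File::LOCK_NB)
  warn 'Another scheduler already holds the lock, exiting.'
  exit 0
end
# Keep lock_file open for the life of the process; the OS releases
# the lock automatically when the process exits.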

Oh boy, I think you lost me here.

Are you trying to build some HA setup?
Reason I ask: I don’t see why two different instances should have just one scheduler (they should each have their own scheduler, and they also shouldn’t care about other instances).

If you are trying to build a high-availability setup, please note that, because of the complexity, we can’t support you with this in any way. It’s also not (officially) supported by Zammad at the moment, so you’ll need tons of fiddling.

Hi!

Yes, sorry if that has not been clear. We are running Zammad in a kind-of-HA way.

We value our mails and uptime, so we cannot accept that a single node being down for updates or planned maintenance leaves our team effectively unable to work and our customers unable to reach us through their preferred way of communicating. There are also SLAs to consider.

But that answers my question completely. We will need to patch scheduler.rb so that it can detect whether another scheduler is already running, so that’s fine. Everything else kinda/sorta works, but it has only been running for a week.
Maybe we will switch from running two instances at the same time for load-balancing purposes to keeping a “cold spare” instance available in case one fails or needs to be taken offline for maintenance.

Thanks for your time!

Glad I could help at least a bit. :slight_smile:

I think that’s the better and easier approach.
Please keep in mind that you also need to synchronize your filesystem :slight_smile:

We’re in a testing phase right now and will try that approach once we have gathered more issues that might be caused by running two instances at the same time, so we can see whether they go away after switching. If I remember, I’ll update this post :+1:

Parts of the filesystem are shared via Gluster (which is a whole thing on its own…), and we’re reasonably sure we have synced almost everything relevant.

