SMTP 550 "Callout verification failed" on single address aborts rest of notifications

This might be an edge case due to some weird form of email forwarding or allowlisting being done on a remote MX, because I know Zammad will usually log an error in a ticket when an e-mail fails to be delivered to a recipient, so the behaviour below is puzzling me.

Infos:

  • Used Zammad version: 5.3.1-1671611341.37743b14.focal
  • Used Zammad installation type: Package
  • Operating system: Ubuntu 20.04.5 LTS (Focal Fossa)

Expected behavior:

When Zammad sends notifications to all agents about new tickets, or ticket updates and encounters a specific SMTP error on one of the email-addresses, it logs an error message about the address and skips to the next one in the list of agents to try to notify.

Actual behavior:

It seems Zammad stops sending pending notifications entirely if a single agent e-mailaddress causes a ā€œcallout verification failedā€ event on a remote MX. It seems to crash the channel with a fatal error. It does show up in the monitoring, but with little extra information.
The ticket history shows two notifications were sent, the last one in the list appears to be the one which crashed the driver (marked with an arrow). No error is displayed in either the ticket history or the ticket itself.

Steps to reproduce the behavior:

You need to have a couple of agents and at least one agent with an email-address which produces an SMTP error on a remote MX, for example, this list of agents:

agent1@aaa.example.com
agent2@bbb.example.com
agent3@ccc.example.com
agent4@ddd.example.com

If the mail for agent2 is rejected with the callout verification failed error, agents 3 and 4 wonā€™t receive a notification (ever).

Logging

Zammad production.log:

E, [2023-02-17T12:10:29.768825 #1656135-112080] ERROR -- : Can't use Channel::Driver::Smtp: #<Net::SMTPFatalError: 550-Callout verification failed:
>
E, [2023-02-17T12:10:29.768931 #1656135-112080] ERROR -- : 550-Callout verification failed:
 (Net::SMTPFatalError)
app/models/channel/driver/smtp.rb:93:in `send'
app/models/channel.rb:257:in `deliver'
lib/notification_factory/mailer.rb:170:in `send'
lib/notification_factory/mailer.rb:223:in `notification'
app/models/transaction/notification.rb:216:in `block in perform'
app/models/transaction/notification.rb:105:in `each'
app/models/transaction/notification.rb:105:in `perform'
lib/transaction_dispatcher.rb:61:in `execute_single_backend'
app/jobs/transaction_job.rb:25:in `block in perform'
app/jobs/transaction_job.rb:21:in `perform'
lib/background_services/service/process_delayed_jobs.rb:25:in `block (2 levels) in launch'
lib/background_services/service/process_delayed_jobs.rb:23:in `block in launch'
lib/background_services/service/process_delayed_jobs.rb:20:in `loop'
lib/background_services/service/process_delayed_jobs.rb:20:in `launch'
lib/background_services/service.rb:27:in `block in run'
lib/application_handle_info.rb:19:in `use'
lib/background_services/service.rb:33:in `block in run_in_service_context'
lib/background_services/service.rb:32:in `run_in_service_context'
lib/background_services/service.rb:26:in `run'
lib/background_services.rb:67:in `block in start_as_thread'
E, [2023-02-17T12:10:29.774215 #1656135-112080] ERROR -- : Can't use Channel::Driver::Smtp: #<Net::SMTPFatalError: 550-Callout verification failed:
> (RuntimeError)
app/models/channel.rb:270:in `rescue in deliver'
app/models/channel.rb:245:in `deliver'
lib/notification_factory/mailer.rb:170:in `send'
lib/notification_factory/mailer.rb:223:in `notification'
app/models/transaction/notification.rb:216:in `block in perform'
app/models/transaction/notification.rb:105:in `each'
app/models/transaction/notification.rb:105:in `perform'
lib/transaction_dispatcher.rb:61:in `execute_single_backend'
app/jobs/transaction_job.rb:25:in `block in perform'
app/jobs/transaction_job.rb:21:in `perform'
lib/background_services/service/process_delayed_jobs.rb:25:in `block (2 levels) in launch'
lib/background_services/service/process_delayed_jobs.rb:23:in `block in launch'
lib/background_services/service/process_delayed_jobs.rb:20:in `loop'
lib/background_services/service/process_delayed_jobs.rb:20:in `launch'
lib/background_services/service.rb:27:in `block in run'
lib/application_handle_info.rb:19:in `use'
lib/background_services/service.rb:33:in `block in run_in_service_context'
lib/background_services/service.rb:32:in `run_in_service_context'
lib/background_services/service.rb:26:in `run'
lib/background_services.rb:67:in `block in start_as_thread'

Zammad Monitoring status:

Zammad E Channel status (redacted):

SMTP log on relayhost (redacted). The relayhost receives an error from a remote MX, which is probably some sort of forwarding service for this specific agentā€™s emailaddress (ZZZ). The XXX is our own SMTP relayhost, YYY is the remote MX:

Feb 17 12:10:29 relay.XXXXXXXX.XXX exim[20764]: 2023-02-17 12:10:29 H=(XXXXXXXXXX) [XXX.XXX.XXX.XXX] X=TLS1.2:ECDHE-RSA-AES256-GCM-SHA384:256 CV=no F=<helpdesk@example.com> A=login:XXXXXXXXX rejected RCPT <ZZZZZZZZ>: mail.YYYYYYY.YYY [YYY.YYY.YYY.YYY] : SMTP error from remote mail server after RCPT TO:<ZZZZZZZZ>: 550 5.1.0 <helpdesk@example.com>: Sender address rejected: User unknown in virtual mailbox table

Channels are self healing. As your notification channel is affected, the issue will persist as long as Zammad tries to send mail to the problem agent. As soon as you solve that (whatever it is) and the next email notification is being send, the channel automatically switches back to ā€œOKā€ / green and will disappear from your monitoring.

The 550 is not a Zammad code and comes from your mail server and out of Zammadā€™s control. The mail transactions (apart from that one affected) should still send normally as expected.

I understand that. What I donā€™t understand is that a single error on one of the agent email addresses causes a fatal error on the notifications sender, which in turn means no further pending notifications about a ticket-update are sent to the rest of the agents. The entire process is aborted. I would have expected the notifications sender to just skip to the next agent email address to notify, instead of aborting/crashing entirely.

No, they do not. That is what I would have expected too, whenever the affected agent is part of the group of agents to be notified of a ticket update, the notifications sender aborts all further pending notifications when it gets to the affected agent and receives an error from the mailserver.

I think Zammad treats the mailserver error as a generic fatal server error, while it is in fact just a fatal error on a single recipient.

That is impossible.
Zammad sends each agent mail one by one (as it does with every mail).
Are your other agents really not receiving any mail while having it configuredā€¦?

If nobody receives mail, it would mean that every single mail Zammad tries to send via that channel is being rejected by either your relay or the receiving server.

Your production log might be gold in that scenario.
Code wise Iā€™m very sure that your described behavior is not a thing, but as youā€™re saying youā€™re actually seeing it happen, the log would be the proof most likely.
That being said - your instance would be the first ever Iā€™ve seen to behave like that.

Another side noteā€¦ possibly have a look into your users preferences (User.find_by(email: ā€˜ā€™).preferences) and check if thereā€™s mail_delivery_failed: true set. If thatā€™s the case, the affected user shouldnā€™t receive mail at all.

Yet here we are.

Trust me, Iā€™m quite sure of this. Iā€™ve even reproduced it for you to see for yourself (see further down below).

No. Itā€™s not that nobody receives mail, itā€™s that everybody receives email up until it reaches the agent with issues. The error causes the notifications sender to crash and it seems to abort the rest of the queue for pending notifications.

The log shows the same error as Iā€™ve posted in the start of the topic.

I understand that, itā€™s probably less likely that people will both have issues with their email account and have notifications enabled via email. I have tried to explain it in the following reproduction.

Reproduction

Reference starting point

A Zammad instance with multiple agents, some of whom have email notifications configured.
In the meantime during the start of this topic and this update, we have updated Zammad to version 5.3.1-1678258494.e38c655f.focal, and this test is done on that version.
When a new ticket is created by a customer sending an email to the helpdesk, the following happens regarding notifications:

The bart user is a fictional account, created for this test. It has deliberately been created with a letter early in the alphabet, as notifications seem to be sent in alphabetical order. The dennis account is my own Zammad useraccount.

Simulating the error conditions

Adjust the email address of one of the agents (the bart account in this example) to an address which doesnā€™t exist on an external mailserver. This user must have email notifications configured in order to trigger the error. Let the customer update the ticket by sending an email. The following happens regarding the notifications:

Notice how the notifications stop at the bart account (with the new non-existant email address).

Reverting the error conditions

Then, after changing the email address back to a known good address, and updating the ticket, the following happens regarding the notifications:

I have received email notifications of the first and the third ticket update. I did not receive a notification about the second ticket update.

Logging

Anonymized Zammad production.log during the simulated error condition, from the start of the IMAP fetch of the email up to the error during the notifications sender:

I, [2023-03-16T12:54:08.928311 #1016278-112800]  INFO -- : fetching imap (imap.example.com/username@example.com port=993,ssl=true,starttls=false,folder=INBOX,keep_on_server=false,auth_type=LOGIN)
I, [2023-03-16T12:54:09.111174 #1016278-112800]  INFO -- :  - message 1/1
I, [2023-03-16T12:54:09.200593 #1016278-112800]  INFO -- : Process email with msgid '<12345678-1234-1234-1234-1234567890@gmail.com>'
I, [2023-03-16T12:54:09.219278 #1016278-112800]  INFO -- :   matching content 'subject' contains TESTTICKET
I, [2023-03-16T12:54:09.228619 #1016278-112800]  INFO -- : set_attributes_by_x_headers header x-zammad-article-preferences found. Assign preferences={"send-auto-response"=>true, "is-auto-response"=>false}
I, [2023-03-16T12:54:09.368808 #1016278-112800]  INFO -- : ended Channel.fetch took: 0.680957435 seconds.
I, [2023-03-16T12:54:10.134020 #523-15832760]  INFO -- : Started GET "/api/v1/tickets/1330?full=true&_=1678967257026" for 2aaa:aaaa:aaaa:a:aaaa:aaaa:aaa:aaaa at 2023-03-16 12:54:10 +0100
I, [2023-03-16T12:54:10.138714 #523-15832760]  INFO -- : Processing by TicketsController#show as JSON
I, [2023-03-16T12:54:10.138807 #523-15832760]  INFO -- :   Parameters: {"full"=>"true", "_"=>"1678967257026", "id"=>"1330"}
I, [2023-03-16T12:54:10.165504 #523-15832760]  INFO -- : Completed 200 OK in 27ms (Views: 1.4ms | ActiveRecord: 5.2ms | Allocations: 14836)
I, [2023-03-16T12:54:10.509511 #523-15727360]  INFO -- : Started GET "/api/v1/tickets/1330?all=true&auto_assign=true&_=1678967257027" for 2aaa:aaaa:aaaa:a:aaaa:aaaa:aaa:aaaa at 2023-03-16 12:54:10 +0100
I, [2023-03-16T12:54:10.513840 #523-15727360]  INFO -- : Processing by TicketsController#show as JSON
I, [2023-03-16T12:54:10.513885 #523-15727360]  INFO -- :   Parameters: {"all"=>"true", "auto_assign"=>"true", "_"=>"1678967257027", "id"=>"1330"}
I, [2023-03-16T12:54:10.736109 #523-15727360]  INFO -- : Completed 200 OK in 222ms (Views: 10.7ms | ActiveRecord: 14.1ms | Allocations: 169093)
I, [2023-03-16T12:54:11.078613 #523-15726760]  INFO -- : Started POST "/api/v1/integration/smime" for 2aaa:aaaa:aaaa:a:aaaa:aaaa:aaa:aaaa at 2023-03-16 12:54:11 +0100
I, [2023-03-16T12:54:11.084817 #523-15726760]  INFO -- : Processing by Integration::SMIMEController#search as JSON
I, [2023-03-16T12:54:11.084892 #523-15726760]  INFO -- :   Parameters: {"ticket"=>{"group_id"=>"1", "owner_id"=>"", "state_id"=>"1", "priority_id"=>"2", "xxxxxxxxx"=>"", "xxxxxxxxxx"=>"", "xxxxxxxxx"=>"", "xxxxxxxxx"=>"", "xxxxxxxxxxxxx"=>[]}, "article"=>{"body"=>"[FILTERED]", "type"=>"note", "internal"=>true, "form_id"=>"967544056", "shared_draft_id"=>"", "subtype"=>"", "in_reply_to"=>"", "to"=>"", "cc"=>"", "subject"=>""}}
I, [2023-03-16T12:54:11.097857 #523-15726760]  INFO -- : Completed 200 OK in 13ms (Views: 0.2ms | ActiveRecord: 2.9ms | Allocations: 6424)
I, [2023-03-16T12:54:11.421352 #1016278-112020]  INFO -- : ProcessScheduledJobs running...
I, [2023-03-16T12:54:11.423485 #1016278-112020]  INFO -- : Running job thread for 'Process ticket escalations.' (Ticket.process_escalation) status is: sleep
I, [2023-03-16T12:54:11.423704 #1016278-112020]  INFO -- : Running job thread for 'Generate 'Session' data.' (Sessions.jobs) status is: sleep
I, [2023-03-16T12:54:11.423730 #1016278-112020]  INFO -- : Running job thread for 'Check channels.' (Channel.fetch) status is: sleep
I, [2023-03-16T12:54:11.423748 #1016278-112020]  INFO -- : Running job thread for 'Check 'Channel' streams.' (Channel.stream) status is: sleep
I, [2023-03-16T12:54:11.424014 #1016278-112020]  INFO -- : Running job thread for 'Execute planned jobs.' (Job.run) status is: sleep
I, [2023-03-16T12:54:12.277405 #1016278-111980]  INFO -- : 2023-03-16T12:54:12+0100: [Worker(host:helpdesk.example.com pid:1016278)] Job TransactionJob [fd8b3789-0879-4b59-9661-3d0651eae6ba] from DelayedJob(default) with arguments: [{"object"=>"Ticket", "object_id"=>1330, "user_id"=>180, "created_at"=>{"_aj_serialized"=>"ActiveJob::Serializers::TimeWithZoneSerializer", "value"=>"2023-03-16T11:54:09.248224974Z"}, "type"=>"update", "article_id"=>4183, "_aj_symbol_keys"=>["object", "object_id", "user_id", "created_at", "type", "article_id"]}, {"interface_handle"=>"scheduler", "disable"=>[], "_aj_symbol_keys"=>["interface_handle", "disable"]}] (id=3320203) (queue=default) RUNNING
I, [2023-03-16T12:54:12.891533 #523-15601600]  INFO -- : Started PUT "/api/v1/taskbar/4856" for 2aaa:aaaa:aaaa:a:aaaa:aaaa:aaa:aaaa at 2023-03-16 12:54:12 +0100
I, [2023-03-16T12:54:12.896534 #523-15601600]  INFO -- : Processing by TaskbarController#update as JSON
I, [2023-03-16T12:54:12.896617 #523-15601600]  INFO -- :   Parameters: {"key"=>"Ticket-1330", "client_id"=>"123", "callback"=>"TicketZoom", "state"=>{"ticket"=>{}, "article"=>{"form_id"=>"967544056"}}, "params"=>{"ticket_id"=>"1330", "article_id"=>"4182", "shown"=>true}, "prio"=>7, "notify"=>true, "active"=>true, "attachments"=>[], "updated_at"=>"2023-03-16T11:51:53.513Z", "id"=>"4856"}
I, [2023-03-16T12:54:12.918715 #523-15601600]  INFO -- : Completed 200 OK in 22ms (Views: 0.3ms | ActiveRecord: 3.1ms | Allocations: 11602)
E, [2023-03-16T12:54:13.939089 #1016278-111980] ERROR -- : Can't use Channel::Driver::Smtp: #<Net::SMTPFatalError: 550-Callout verification failed:
>
E, [2023-03-16T12:54:13.939312 #1016278-111980] ERROR -- : 550-Callout verification failed:
 (Net::SMTPFatalError)
app/models/channel/driver/smtp.rb:93:in `send'
app/models/channel.rb:257:in `deliver'
lib/notification_factory/mailer.rb:170:in `send'
lib/notification_factory/mailer.rb:223:in `notification'
app/models/transaction/notification.rb:216:in `block in perform'
app/models/transaction/notification.rb:105:in `each'
app/models/transaction/notification.rb:105:in `perform'
lib/transaction_dispatcher.rb:61:in `execute_single_backend'
app/jobs/transaction_job.rb:25:in `block in perform'
app/jobs/transaction_job.rb:21:in `perform'
lib/background_services/service/process_delayed_jobs.rb:25:in `block (2 levels) in launch'
lib/background_services/service/process_delayed_jobs.rb:23:in `block in launch'
lib/background_services/service/process_delayed_jobs.rb:20:in `loop'
lib/background_services/service/process_delayed_jobs.rb:20:in `launch'
lib/background_services/service.rb:27:in `block in run'
lib/application_handle_info.rb:19:in `use'
lib/background_services/service.rb:33:in `block in run_in_service_context'
lib/background_services/service.rb:32:in `run_in_service_context'
lib/background_services/service.rb:26:in `run'
lib/background_services.rb:67:in `block in start_as_thread'
E, [2023-03-16T12:54:13.944715 #1016278-111980] ERROR -- : Can't use Channel::Driver::Smtp: #<Net::SMTPFatalError: 550-Callout verification failed:
> (RuntimeError)
app/models/channel.rb:270:in `rescue in deliver'
app/models/channel.rb:245:in `deliver'
lib/notification_factory/mailer.rb:170:in `send'
lib/notification_factory/mailer.rb:223:in `notification'
app/models/transaction/notification.rb:216:in `block in perform'
app/models/transaction/notification.rb:105:in `each'
app/models/transaction/notification.rb:105:in `perform'
lib/transaction_dispatcher.rb:61:in `execute_single_backend'
app/jobs/transaction_job.rb:25:in `block in perform'
app/jobs/transaction_job.rb:21:in `perform'
lib/background_services/service/process_delayed_jobs.rb:25:in `block (2 levels) in launch'
lib/background_services/service/process_delayed_jobs.rb:23:in `block in launch'
lib/background_services/service/process_delayed_jobs.rb:20:in `loop'
lib/background_services/service/process_delayed_jobs.rb:20:in `launch'
lib/background_services/service.rb:27:in `block in run'
lib/application_handle_info.rb:19:in `use'
lib/background_services/service.rb:33:in `block in run_in_service_context'
lib/background_services/service.rb:32:in `run_in_service_context'
lib/background_services/service.rb:26:in `run'
lib/background_services.rb:67:in `block in start_as_thread'
I, [2023-03-16T12:54:13.946244 #1016278-111980]  INFO -- : 2023-03-16T12:54:13+0100: [Worker(host:helpdesk.example.com pid:1016278)] Job TransactionJob [fd8b3789-0879-4b59-9661-3d0651eae6ba] from DelayedJob(default) with arguments: [{"object"=>"Ticket", "object_id"=>1330, "user_id"=>180, "created_at"=>{"_aj_serialized"=>"ActiveJob::Serializers::TimeWithZoneSerializer", "value"=>"2023-03-16T11:54:09.248224974Z"}, "type"=>"update", "article_id"=>4183, "_aj_symbol_keys"=>["object", "object_id", "user_id", "created_at", "type", "article_id"]}, {"interface_handle"=>"scheduler", "disable"=>[], "_aj_symbol_keys"=>["interface_handle", "disable"]}] (id=3320203) (queue=default) COMPLETED after 1.6687
I, [2023-03-16T12:54:13.947954 #1016278-111980]  INFO -- : 2023-03-16T12:54:13+0100: [Worker(host:helpdesk.example.com pid:1016278)] Job TicketUserTicketCounterJob [5bb25695-68da-4cc8-af7c-49fa3555cd17] from DelayedJob(default) with arguments: [180, 180] (id=3320204) (queue=default) RUNNING

Look what I found. Seems like I was wrong, thatā€™s your issue, isnā€™t it?

1 Like

Oh, that appears to be the exact issue indeed. I searched the forum for this problem before posting it, after not finding relevant results, but I didnā€™t check the open issues on GitHub, sorry about that :sweat_smile:

Not your fault. I commented on the issue years ago and didnā€™t remember. Shame on me :stuck_out_tongue_winking_eye:
Iā€™ll see if I can get some clamps on that one.

1 Like

This topic was automatically closed 360 days after the last reply. New replies are no longer allowed.