Infos:
- Used Zammad version: 7.0.1-1780384166.e917c72c.noble
- Used Zammad installation type: package
- Operating system: Ubuntu 26.04
- Browser + version: (not relevant — server-side fetch behaviour)
Disclaimer: Everything except this section was drafted with AI. I had asked it for help earlier, expecting to solve the problem myself by changing a limit in the Entra portal, but the problem turned out to be more technical than my abilities cover. I'm also German: I can draft a text like this disclaimer, but I'm not sure I could write the rest in a way that is understandable for you. The production.log was already anonymized, and the AI replaced some values with anonymous ones again. I hope this text contains all the information you need. If not, don't hesitate to ask for more.
This installation was set up a week ago. It is a fresh installation of Ubuntu and Zammad where everything works except the Microsoft Graph channel. I tried Microsoft Graph IMAP, but that has the SMTP issue, and I don't want to disable the security standards in the Entra portal to activate SMTP.
Log-File: production.log
Expected behavior
When the Microsoft Graph inbound channel receives a 503 Service Unavailable
from Microsoft, the channel should apply a back-off and increase the delay
between fetch attempts (ideally honouring Retry-After, or otherwise an
exponentially increasing delay), so the overloaded mailbox backend can recover.
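The expected behaviour can be illustrated with a minimal sketch. This is not Zammad code: the method name `next_fetch_delay` and the constants `BASE_DELAY` and `MAX_DELAY` are hypothetical. The 30s base is only assumed to match the scheduler interval seen in the log, and the one-hour cap is an arbitrary choice.

```ruby
# Hypothetical helper, not part of Zammad.
BASE_DELAY = 30    # seconds; assumed to match the ~30s scheduler interval
MAX_DELAY  = 3600  # seconds; assumed upper bound for the back-off

# consecutive_failures: number of failed fetches in a row so far.
# retry_after: Retry-After value in seconds if the server sent one, else nil.
def next_fetch_delay(consecutive_failures, retry_after: nil)
  # Prefer the server's hint; otherwise double the delay per failure.
  delay = retry_after || BASE_DELAY * (2**consecutive_failures)
  [delay, MAX_DELAY].min
end

puts next_fetch_delay(1)                   # 60
puts next_fetch_delay(3)                   # 240
puts next_fetch_delay(3, retry_after: 120) # 120
```

With a scheme like this, the third consecutive 503 would already push the next attempt out to four minutes instead of another ~30 seconds.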
Actual behavior
The channel keeps retrying at the normal ~30s scheduler interval with no
back-off at all, even while every single fetch is rejected with a 503.
The 503 returned by Microsoft in this case is not a regular Graph throttling
response. It is a raw HTML error page produced by the IIS/ARR layer in front of
the mailbox backend, with Content-Type: text/html, no Retry-After header,
and no JSON error code:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>Service Unavailable</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii"></HEAD>
<BODY><h2>Application Request Queue Full</h2>
<hr><p>HTTP Error 503. The application request queue is full.</p>
</BODY></HTML>
(503)
Because this response carries no Retry-After and no Graph error code, it looks
like the back-off logic does not recognise it as throttling and treats it as an
ordinary transient error. The fetch therefore keeps firing at the regular
interval, which keeps the mailbox backend’s request queue full and prevents it
from recovering on its own. In our case the channel never recovers until it is
manually paused.
Relevant log excerpt (anonymised)
Note the fetch timestamps: 10:03:24, 10:03:53, 10:04:25 — a steady ~29–32s
interval with no increase between consecutive 503s. Each attempt is rejected in
~0.1s.
I, [2026-06-02T10:03:24] Job ChannelFetchJob [...] (queue=communication_inbound) RUNNING
I, [2026-06-02T10:03:25] fetching Microsoft Graph (zammad@example.tld keep_on_server=false)
E, [2026-06-02T10:03:25] ERROR -- : Unable to list emails from Microsoft Graph server (zammad@example.tld): #<MicrosoftGraph::ApiError: "...<h2>Application Request Queue Full</h2>...HTTP Error 503. The application request queue is full...(503)">
lib/microsoft_graph.rb:141:in 'MicrosoftGraph#handle_error!'
lib/microsoft_graph.rb:127:in 'MicrosoftGraph#make_request'
lib/microsoft_graph.rb:180:in 'MicrosoftGraph#make_paginated_request'
lib/microsoft_graph.rb:40:in 'MicrosoftGraph#list_messages'
app/models/channel/driver/microsoft_graph_inbound.rb:141:in '#messages_iterator'
...
I, [2026-06-02T10:03:53] Job ChannelFetchJob [...] RUNNING
E, [2026-06-02T10:03:53] ERROR -- : ... Application Request Queue Full ... (503)
I, [2026-06-02T10:04:25] Job ChannelFetchJob [...] RUNNING
E, [2026-06-02T10:04:25] ERROR -- : ... Application Request Queue Full ... (503)
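The interval can be checked directly from the three RUNNING timestamps in the excerpt. The snippet below is only a quick verification aid. The log lines carry no time zone, so a `Z` suffix is appended purely to make the values parseable; this does not affect the differences.

```ruby
require "time"

# Start times of the three ChannelFetchJob runs from the excerpt.
stamps = %w[2026-06-02T10:03:24 2026-06-02T10:03:53 2026-06-02T10:04:25]
starts = stamps.map { |t| Time.iso8601("#{t}Z") }

# Seconds between each pair of consecutive runs.
gaps = starts.each_cons(2).map { |a, b| (b - a).to_i }
puts gaps.inspect # [29, 32]
```

With a working back-off, the second gap would be clearly larger than the first, not roughly equal to it.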
Steps to reproduce the behavior
- Connect a Microsoft 365 mailbox via the Graph Email channel.
- Have the mailbox backend return the HTML 503 Application Request Queue Full page (occurs when the per-mailbox backend request queue is saturated).
- Observe in production.log that fetch attempts continue at the regular scheduler interval with no increasing delay between attempts.
Possible cause / suggestion
It looks like MicrosoftGraph#handle_error! raises a generic ApiError for this
HTML 503 without classifying it as a rate-limit/throttling condition, so the
back-off path that was introduced for #5565 is not reached. It might help to
treat any 503 from the Graph endpoint as a back-off trigger (with a sane
default delay when Retry-After is absent), regardless of whether the body is
JSON or an IIS/ARR HTML page.
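A rough sketch of that suggestion follows. It is not the actual `handle_error!` implementation, which I have not inspected. The method name `backoff_seconds`, the 300s default, and treating 429 the same way as 503 are all assumptions. Only the numeric (seconds) form of Retry-After is handled here; the HTTP-date form falls back to the default.

```ruby
# Hypothetical classifier, not Zammad's handle_error! logic.
DEFAULT_BACKOFF = 300 # seconds; assumed default when Retry-After is absent

# Returns the number of seconds to back off, or nil if the response
# should not trigger a back-off. The body (JSON or HTML) is ignored.
def backoff_seconds(status, headers)
  return nil unless [429, 503].include?(status)

  # Header names are case-insensitive, so compare without regard to case.
  value = headers.find { |k, _| k.to_s.casecmp?("retry-after") }&.last
  value.to_s.match?(/\A\d+\z/) ? value.to_i : DEFAULT_BACKOFF
end

# The IIS/ARR page from this report: 503, text/html, no Retry-After.
puts backoff_seconds(503, { "Content-Type" => "text/html" }) # 300
# A regular throttling response that does carry Retry-After.
puts backoff_seconds(429, { "Retry-After" => "60" })         # 60
```

Because the decision depends only on the status code and the header, the HTML 503 from this report would be handled the same way as a JSON throttling response.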
Related: #5565 (Improve rate limit handling of M365 Graph Email Channel).
Thank you for the assistance
(this is from me and not from the AI)
Sebastian