Title: Make AI Provider timeout configurable
- What is your original issue/pain point you want to solve?
I am using a local model and many requests fail because they seem to run into a timeout.
- Which are one or two concrete situations where this problem hurts the most?
- Why is it not solvable with the Zammad standard?
The timeout values seem to be hardcoded:
[2] pry(main)> puts File.read('/opt/zammad/lib/ai/provider/custom_open_ai.rb')
# Copyright (C) 2012-2026 Zammad Foundation, https://zammad-foundation.org/

class AI::Provider::CustomOpenAI < AI::Provider
  include AI::Provider::Concerns::HandlesOpenAIMessages
  include AI::Provider::Concerns::HasConfigurableModel

  DEFAULT_OPTIONS = {
    temperature: 0.1,
  }.freeze

  def chat(prompt_system:, prompt_user:, prompt_image:)
    request_body = {
      model:    model_for(prompt_image:),
      messages: messages_for(prompt_system:, prompt_user:, prompt_image:),
      stream:   false,
      store:    false,
    }

    request_body[:temperature] = options[:temperature]

    request_options = {
      open_timeout:  4,
      read_timeout:  60,
      verify_ssl:    true,
      total_timeout: 60,
      json:          true,
      log:           {
        facility: 'AI::Provider',
      },
    }

    # Token is optional since target host might not require authentication
    request_options[:bearer_token] = config[:token] if config[:token].present?

    response = UserAgent.post(
      "#{config[:url]}/chat/completions",
      request_body,
      request_options,
    )

    data = validate_response!(response)

    extract_response_metadata(data)

    data['choices'].first['message']['content']
  end

  def embeddings(input:)
    raise NotImplementedError, 'not supported for custom OpenAI Compatible providers'
  end

  def self.ping!(config)
    request_options = {
      open_timeout:  4,
      read_timeout:  60,
      verify_ssl:    true,
      total_timeout: 60,
      json:          true,
      log:           {
        facility:          'AI::Provider',
        log_only_on_error: true,
      },
    }

    # Token is optional since target host might not require authentication
    request_options[:bearer_token] = config[:token] if config[:token].present?

    response = UserAgent.get(
      "#{config[:url]}/models",
      {},
      request_options,
    )

    validate_response!(response)

    nil
  end

  private

  def specific_metadata
    {
      model: options[:model],
    }
  end

  def extract_response_metadata(data)
    @response_metadata = {
      prompt_tokens:     data.dig('usage', 'prompt_tokens'),
      completion_tokens: data.dig('usage', 'completion_tokens'),
      total_tokens:      data.dig('usage', 'total_tokens'),
    }
  end
end
=> nil
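For illustration only, here is a minimal sketch of how the timeouts could be made configurable while keeping today's hardcoded values as defaults. The config keys (`:open_timeout`, `:read_timeout`, `:total_timeout`) and the helper name are my own assumptions, not an existing Zammad API:

```ruby
# Hypothetical sketch: build the timeout portion of request_options from the
# provider config, falling back to the values currently hardcoded in
# AI::Provider::CustomOpenAI. The config keys are assumed, not existing
# Zammad settings.
DEFAULT_TIMEOUTS = {
  open_timeout:  4,
  read_timeout:  60,
  total_timeout: 60,
}.freeze

def timeout_options(config)
  DEFAULT_TIMEOUTS.each_with_object({}) do |(key, default), result|
    value = config[key]
    # Use the configured value when present (coerced to an integer),
    # otherwise keep the current hardcoded default.
    result[key] = value ? Integer(value) : default
  end
end
```

With something like this, `timeout_options(read_timeout: 300)` would allow a slow local model five minutes to respond while leaving the other timeouts at their defaults.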
- What is your expectation/what do you want to achieve?
I want to be able to use slower local models. For summarization in particular, generating the response can take longer than the current 60-second read timeout, so the request fails.
- If there is any more useful information, feel free to share it all (e.g. mockup screenshots if something is UI related, or the API URL/documentation URL for a service you need a connection to).
Your Zammad environment:
- Average concurrent agent count: 10
- Average tickets a day: 20
- What roles/people are involved:
Anything else which you think is useful to understand your use case:
Thank you and have fun.