Title: Make AI Provider timeout configurable
- What is your original issue/pain point you want to solve?
I am using a local model and many requests fail because they seem to run into a timeout.
- Which are one or two concrete situations where this problem hurts the most?
- Why is it not solvable with the Zammad standard?
The timeout values seem to be hardcoded:
[2] pry(main)> puts File.read('/opt/zammad/lib/ai/provider/custom_open_ai.rb')
# Copyright (C) 2012-2026 Zammad Foundation, https://zammad-foundation.org/

class AI::Provider::CustomOpenAI < AI::Provider
  include AI::Provider::Concerns::HandlesOpenAIMessages
  include AI::Provider::Concerns::HasConfigurableModel

  DEFAULT_OPTIONS = {
    temperature: 0.1,
  }.freeze

  def chat(prompt_system:, prompt_user:, prompt_image:)
    request_body = {
      model:    model_for(prompt_image:),
      messages: messages_for(prompt_system:, prompt_user:, prompt_image:),
      stream:   false,
      store:    false,
    }

    request_body[:temperature] = options[:temperature]

    request_options = {
      open_timeout:  4,
      read_timeout:  60,
      verify_ssl:    true,
      total_timeout: 60,
      json:          true,
      log:           {
        facility: 'AI::Provider',
      },
    }

    # Token is optional since target host might not require authentication
    request_options[:bearer_token] = config[:token] if config[:token].present?

    response = UserAgent.post(
      "#{config[:url]}/chat/completions",
      request_body,
      request_options,
    )

    data = validate_response!(response)

    extract_response_metadata(data)

    data['choices'].first['message']['content']
  end

  def embeddings(input:)
    raise NotImplementedError, 'not supported for custom OpenAI Compatible providers'
  end

  def self.ping!(config)
    request_options = {
      open_timeout:  4,
      read_timeout:  60,
      verify_ssl:    true,
      total_timeout: 60,
      json:          true,
      log:           {
        facility:          'AI::Provider',
        log_only_on_error: true,
      },
    }

    # Token is optional since target host might not require authentication
    request_options[:bearer_token] = config[:token] if config[:token].present?

    response = UserAgent.get(
      "#{config[:url]}/models",
      {},
      request_options,
    )

    validate_response!(response)

    nil
  end

  private

  def specific_metadata
    {
      model: options[:model],
    }
  end

  def extract_response_metadata(data)
    @response_metadata = {
      prompt_tokens:     data.dig('usage', 'prompt_tokens'),
      completion_tokens: data.dig('usage', 'completion_tokens'),
      total_tokens:      data.dig('usage', 'total_tokens'),
    }
  end
end
=> nil
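For illustration only, here is a minimal sketch of how the timeouts could be made configurable while keeping today's hardcoded values as defaults. The config keys (`:open_timeout`, `:read_timeout`, `:total_timeout`) and the helper name are my own assumptions, not an existing Zammad API:

```ruby
# Hypothetical sketch: build the timeout portion of request_options from the
# provider config, falling back to the values currently hardcoded in
# AI::Provider::CustomOpenAI. The config keys are assumed, not existing
# Zammad settings.
DEFAULT_TIMEOUTS = {
  open_timeout:  4,
  read_timeout:  60,
  total_timeout: 60,
}.freeze

def timeout_options(config)
  DEFAULT_TIMEOUTS.each_with_object({}) do |(key, default), result|
    value = config[key]
    # Use the configured value when present (coerced to an integer),
    # otherwise keep the current hardcoded default.
    result[key] = value ? Integer(value) : default
  end
end
```

With something like this, `timeout_options(read_timeout: 300)` would allow a slow local model five minutes to respond while leaving the other timeouts at their defaults.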
- What is your expectation/what do you want to achieve?
I want to be able to use slower local models. For summarization in particular, generating the response can take longer than the current 60-second read timeout, so the request fails.
- If there is any more useful information, feel free to share it all (e.g. mockup screenshots if something is UI related, or the API URL/documentation URL for a service you need a connection to).
Your Zammad environment:
- Average concurrent agent count: 10
- Average tickets a day: 20
- What roles/people are involved:
Anything else which you think is useful to understand your use case:
Thank you and have fun.