Hi there,
I’d like to contribute to Zammad, and I figure this is a good place to start.
Currently Zammad writes some sensitive information to its logs, and we would really like to tighten this up.
In particular, Zammad logs the IP address of the user on every request:
I, [2018-10-22T15:02:17.517447 #670] INFO -- : Started GET "/" for 11.22.33.44 at 2018-10-22 15:02:17 +0000
Also, on login the username is logged:
I, [2018-10-22T15:02:34.327476 #670] INFO -- : Processing by SessionsController#create as JSON
I, [2018-10-22T15:02:34.327673 #670] INFO -- : Parameters: {"username"=>"mytestuser@example.com", "password"=>"[FILTERED]", "fingerprint"=>"-1380143529"}
PII can also spill into the logs when a POST to Elasticsearch fails:
E, [2018-10-22T15:02:38.206232 #672] ERROR -- : 2018-10-22T15:02:38+0000: [Worker(host:li1381-89 pid:672)] Job BackgroundJobSearchIndex (id=20614) FAILED (0 prior attempts) with RuntimeError: Unable to process POST request to elasticsearch URL 'http://localhost:9200/zammad_production/User/37'. Check the response and payload for detailed information:
Response:
#<UserAgent::Result:0x00007f0aa65c7b70 @success=false, @body=nil, @data=nil, @code="400", @content_type=nil, @error="Client Error: #<Net::HTTPBadRequest 400 Bad Request readbody=true>!">
Payload:
{"id"=>37, "organization_id"=>nil, "login"=>"mytestuser@example.com", "firstname"=>"Abel", "lastname"=>"GP", "email"=>"mytestuser@example.com", "web"=>"", "phone"=>"", "fax"=>"", "mobile"=>"", "department"=>"", "street"=>"", "zip"=>"", "city"=>"", "country"=>"", "address"=>"", "vip"=>false, "verified"=>false, "active"=>true, "note"=>"", "last_login"=>Mon, 22 Oct 2018 15:02:34 UTC +00:00, "out_of_office"=>false, "out_of_office_start_at"=>nil, "out_of_office_end_at"=>nil, "out_of_office_replacement_id"=>nil, "preferences"=>{"notification_config"=>{"matrix"=>{"create"=>{"criteria"=>{"owned_by_me"=>true, "owned_by_nobody"=>true}, "channel"=>{"email"=>false, "online"=>true}}, "update"=>{"criteria"=>{"owned_by_me"=>true, "owned_by_nobody"=>true}, "channel"=>{"email"=>false, "online"=>true}}, "reminder_reached"=>{"criteria"=>{"owned_by_me"=>true}, "channel"=>{"email"=>false, "online"=>true}}, "escalation"=>{"criteria"=>{"owned_by_me"=>true}, "channel"=>{"email"=>false, "online"=>true}}}, "group_ids"=>["1"]}, "locale"=>"en-us", "intro"=>true, "notification_sound"=>{"file"=>"Xylo.mp3", "enabled"=>true}}, "updated_by_id"=>1, "created_by_id"=>3, "created_at"=>Wed, 20 Jun 2018 15:56:55 UTC +00:00, "updated_at"=>Mon, 22 Oct 2018 15:02:34 UTC +00:00, "nickname"=>"", "permissions"=>["admin", "admin.user", "admin.group", "admin.role", "admin.organization", "admin.overview", "admin.text_module", "admin.macro", "admin.tag", "admin.calendar", "admin.sla", "admin.scheduler", "admin.report_profile", "admin.channel_web", "admin.channel_formular", "admin.channel_email", "admin.channel_twitter", "admin.channel_facebook", "admin.channel_telegram", "admin.channel_chat", "admin.branding", "admin.setting_system", "admin.security", "admin.ticket", "admin.package", "admin.integration", "admin.api", "admin.object", "admin.translation", "admin.monitoring", "admin.maintenance", "admin.session", "user_preferences", "user_preferences.password", "user_preferences.notifications", 
"user_preferences.access_token", "user_preferences.language", "user_preferences.linked_accounts", "user_preferences.device", "user_preferences.avatar", "user_preferences.calendar", "user_preferences.out_of_office", "report", "ticket.agent", "chat.agent", "cti.agent"], "role_ids"=>[1, 2]}
Ideally this data should not be printed to the log at all, or there should be an option to suppress the info in the logs.
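For the suppression option, here is a minimal sketch of what I have in mind, using only Ruby's standard `Logger`. The helper names (`redact_pii`, `build_redacting_logger`), the `redact:` flag, and the two regular expressions are my own assumptions for illustration, not existing Zammad code, and the patterns are far too naive for production use.

```ruby
require 'logger'
require 'stringio'

# Hypothetical patterns (assumption): a real implementation would need
# more careful matching, e.g. for IPv6 addresses.
IPV4_PATTERN  = /\b(?:\d{1,3}\.){3}\d{1,3}\b/
EMAIL_PATTERN = /\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b/

# Hypothetical helper: replaces e-mail and IPv4 addresses in a message.
def redact_pii(message)
  message.to_s
         .gsub(EMAIL_PATTERN, '[FILTERED]')
         .gsub(IPV4_PATTERN, '[FILTERED]')
end

# Hypothetical helper: builds a Logger whose formatter redacts each message
# before handing it to the default formatter. `redact: false` turns the
# filtering off, which is the "option" part of the idea.
def build_redacting_logger(io, redact: true)
  logger = Logger.new(io)
  default_formatter = Logger::Formatter.new
  logger.formatter = proc do |severity, time, progname, msg|
    msg = redact_pii(msg) if redact
    default_formatter.call(severity, time, progname, msg)
  end
  logger
end

# Usage with the two log lines from above.
io = StringIO.new
logger = build_redacting_logger(io)
logger.info('Started GET "/" for 11.22.33.44 at 2018-10-22 15:02:17 +0000')
logger.info('Parameters: {"username"=>"mytestuser@example.com"}')
puts io.string
```

Redacting in the formatter means every code path that logs through this logger is covered, at the cost of running two regular expressions on every message.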
Another strategy other projects have taken is to output all PII only at a low-priority log level (such as debug or trace), so that any output at INFO or higher is guaranteed to contain no PII. One could achieve this by logging the error message at the ERROR level and the details (e.g., the Elasticsearch POST payload) at the DEBUG level.
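A rough sketch of that split, again using only the standard `Logger`. `log_index_failure` and its arguments are hypothetical names I made up for illustration, not Zammad's actual job code.

```ruby
require 'logger'
require 'stringio'

# Hypothetical helper (assumption, not Zammad code): logs a PII-free summary
# at ERROR and the full payload only at DEBUG.
def log_index_failure(logger, url, response_code, payload)
  logger.error("Unable to process POST request to elasticsearch URL '#{url}' (HTTP #{response_code})")
  # Block form: the payload string is only built if DEBUG is enabled.
  logger.debug { "Payload: #{payload.inspect}" }
end

payload = { 'id' => 37, 'login' => 'mytestuser@example.com' }
url = 'http://localhost:9200/zammad_production/User/37'

# With the level at INFO, only the summary line is written.
io = StringIO.new
logger = Logger.new(io)
logger.level = Logger::INFO
log_index_failure(logger, url, '400', payload)
puts io.string

# With the level at DEBUG, the payload is written as well.
debug_io = StringIO.new
debug_logger = Logger.new(debug_io)
debug_logger.level = Logger::DEBUG
log_index_failure(debug_logger, url, '400', payload)
puts debug_io.string
```

The appeal of this approach is that it needs no pattern matching: whether PII appears in the log depends only on the configured log level.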
Before I start writing code for this, does the development team have any input on what strategy to take to solve this?