Upgrading Zammad from version 3.2 to version 3.4

robert_muehlbauer · August 4, 2020, 4:13pm

Used Zammad version: Upgrading from 3.2 to 3.4
Used Zammad installation source: Docker (zammad/zammad-docker-compose on hub.docker.com)
Operating system: n.a.
Browser + version: n.a

Expected behavior:

I want to migrate/upgrade a running zammad 3.2 installation to zammad 3.4.
I’m using the “official” zammad docker containers.
Database is PostgreSQL (version 10.2 for zammad 3.2 and 11.7 for zammad 3.4)
Zammad 3.2 is a production environment and zammad 3.4 is running in parallel in a different environment.

Basically I’m trying to copy/migrate the data from the v3.2 environment to the v3.4 one.

I want all my data to be on the 3.4 system after migration

Actual behavior:

After importing a database dump from the v3.2 system into the v3.4 system (only version A worked, see below), the system starts to “degenerate”: especially when I make changes to users/groups/roles, I’m suddenly not allowed to see my tickets anymore, can’t assign tickets to agents, users are no longer visible at all, and many more strange things happen.

Steps to reproduce the behavior:

As I couldn’t find any detailed information on how to migrate Zammad data to an upgraded version on another host, I’ve tried two slightly different approaches:

1. created a db dump on the v3.2 system (postgresql 10.6 - zammad app server was stopped to do so)
2. imported the dump on the v3.4 system (postgresql 11.7 - zammad app server was stopped to do so)
   - version A: I created a completely empty database to import the dump into - importing the dump worked without any problems
   - version B: I tried to import the dump into the database that was created by the zammad app server during initial startup - this produced a lot of errors and didn’t really work
3. successfully started the zammad app server again
4. logged in, navigated around a bit and made changes to users/groups/roles (edited users or created new ones)
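
For readers following along, the “version A” procedure above could be sketched roughly as below. This is only a sketch under assumptions: the database names, the dump file name and the DRY_RUN switch are placeholders that do not appear in this post, and by default the script only prints the commands instead of running them.

```shell
# Sketch of "version A": restore the 3.2 dump into a completely empty database.
# All names below are placeholders (assumptions), not values from this thread.
SRC_DB="${SRC_DB:-zammad_old}"              # hypothetical 3.2 database name
DST_DB="${DST_DB:-zammad_new}"              # hypothetical empty 3.4 database name
PG_USER="${PG_USER:-postgres}"
DUMP_FILE="${DUMP_FILE:-zammad-3.2.sql.gz}" # hypothetical dump file name
DRY_RUN="${DRY_RUN:-1}"                     # 1 = only print, 0 = really execute

# Print a command line; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    sh -c "$*"
  fi
}

# Run on the old (3.2) host while the Zammad app server is stopped.
dump_source() {
  run "pg_dump -U $PG_USER -d $SRC_DB | gzip > $DUMP_FILE"
}

# Run on the new (3.4) host while the Zammad app server is stopped.
restore_into_empty_db() {
  run "createdb -U $PG_USER $DST_DB"
  run "zcat $DUMP_FILE | psql -U $PG_USER -d $DST_DB"
}

dump_source
restore_into_empty_db
```

Creating the target database explicitly mirrors version A, where the dump goes into an empty database rather than into one that the app server has already initialised (version B).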

MrGeneration · August 14, 2020, 8:26am

Honestly it’s hard to tell what went wrong.
Did you make sure the migrations ran successfully?

Did you maybe degrade the user rights of the logged-in account by accident?
This could also happen if you change permissions on roles you are a member of.

In theory, you should never see less after a migration unless the configuration was changed.

robert_muehlbauer · August 17, 2020, 12:32pm

I think the migrations were all started and finished successfully:

I, [2020-08-17T08:52:17.471822 #11-47384871344480]  INFO -- : Setting.set('models_searchable', ["Organization", "Ticket", "KnowledgeBase::Answer::Translation", "User", "Chat::Session"])
I, [2020-08-17T08:52:18.230444 #11-47384871344480]  INFO -- : Migrating to Issue2641KbColorChangeLimit (20190717210244)
== 20190717210244 Issue2641KbColorChangeLimit: migrating ======================
-- change_column(:knowledge_bases, :color_highlight, :string, {:limit=>25})
   -> 0.0044s
-- change_column(:knowledge_bases, :color_header, :string, {:limit=>25})
   -> 0.0014s
== 20190717210244 Issue2641KbColorChangeLimit: migrated (0.0102s) =============
I, [2020-08-17T08:52:18.276319 #11-47384871344480]  INFO -- : Migrating to Issue2867FooterHeaderPublicLink (20190918114553)
== 20190918114553 Issue2867FooterHeaderPublicLink: migrating ==================
-- add_column(:knowledge_base_menu_items, :location, :string, {:null=>false, :default=>"header"})
   -> 0.0044s
-- add_index(:knowledge_base_menu_items, :location)
   -> 0.0682s
-- change_column_default(:knowledge_base_menu_items, :location, nil)
   -> 0.0588s
== 20190918114553 Issue2867FooterHeaderPublicLink: migrated (0.1339s) =========

I, [2020-08-17T08:52:18.417905 #11-47384871344480]  INFO -- : Migrating to Issue2715FixBrokenTwitterUrls (20191107181428)
== 20191107181428 Issue2715FixBrokenTwitterUrls: migrating ====================
I, [2020-08-17T08:52:18.477810 #11-47384871344480]  INFO -- : Enqueued Issue2715FixBrokenTwitterUrlsJob (Job ID: a4988713-855f-4b2f-9ca4-e7cfb54640a4) to DelayedJob(default)
== 20191107181428 Issue2715FixBrokenTwitterUrls: migrated (0.0428s) ===========

I, [2020-08-17T08:52:18.481959 #11-47384871344480]  INFO -- : Migrating to ActiveJobLockCleanupJobScheduler (20191129102720)
== 20191129102720 ActiveJobLockCleanupJobScheduler: migrating =================
== 20191129102720 ActiveJobLockCleanupJobScheduler: migrated (0.0114s) ========

I, [2020-08-17T08:52:18.502623 #11-47384871344480]  INFO -- : Migrating to SMIMESupport (20200121000001)
== 20200121000001 SMIMESupport: migrating =====================================
-- create_table(:smime_certificates)
   -> 0.1202s
-- add_index(:smime_certificates, [:fingerprint], {:unique=>true})
   -> 0.0093s
-- add_index(:smime_certificates, [:modulus])
   -> 0.0080s
-- add_index(:smime_certificates, [:subject])
   -> 0.0076s
== 20200121000001 SMIMESupport: migrated (0.1990s) ============================

I, [2020-08-17T08:52:18.718257 #11-47384871344480]  INFO -- : Migrating to ChatAddAllowWebsite (20200205000001)
== 20200205000001 ChatAddAllowWebsite: migrating ==============================
-- add_column(:chats, :whitelisted_websites, :string, {:limit=>5000, :null=>true})
   -> 0.0027s
== 20200205000001 ChatAddAllowWebsite: migrated (0.0045s) =====================

I, [2020-08-17T08:52:18.730214 #11-47384871344480]  INFO -- : Migrating to ServiceNowConfig (20200401000001)
== 20200401000001 ServiceNowConfig: migrating =================================
== 20200401000001 ServiceNowConfig: migrated (0.0635s) ========================

I, [2020-08-17T08:52:18.805800 #11-47384871344480]  INFO -- : Migrating to Issue2990DeleteTimeframe (20200413160113)
== 20200413160113 Issue2990DeleteTimeframe: migrating =========================
== 20200413160113 Issue2990DeleteTimeframe: migrated (0.0181s) ================

I, [2020-08-17T08:52:18.835192 #11-47384871344480]  INFO -- : Migrating to SettingWebsocketBackend (20200419204445)
== 20200419204445 SettingWebsocketBackend: migrating ==========================
== 20200419204445 SettingWebsocketBackend: migrated (0.0163s) =================

I, [2020-08-17T08:52:18.861391 #11-47384871344480]  INFO -- : Migrating to ImapAuthenticationMigrationCleanupJobScheduler (20200507095900)
== 20200507095900 ImapAuthenticationMigrationCleanupJobScheduler: migrating ===
== 20200507095900 ImapAuthenticationMigrationCleanupJobScheduler: migrated (0.0143s)

I, [2020-08-17T08:52:18.884820 #11-47384871344480]  INFO -- : Migrating to Issue3085DoorkeeperScopes (20200615150955)
== 20200615150955 Issue3085DoorkeeperScopes: migrating ========================
== 20200615150955 Issue3085DoorkeeperScopes: migrated (0.0169s) ===============

I, [2020-08-17T08:52:18.908852 #11-47384871344480]  INFO -- : Migrating to Issue3087SearchTaskbarDeadlock (20200617153806)
== 20200617153806 Issue3087SearchTaskbarDeadlock: migrating ===================
== 20200617153806 Issue3087SearchTaskbarDeadlock: migrated (0.0303s) ==========

I, [2020-08-17T08:52:18.955842 #11-47384871344480]  INFO -- : Migrating to Issue3110ServiceNowProvider (20200709094556)
== 20200709094556 Issue3110ServiceNowProvider: migrating ======================
== 20200709094556 Issue3110ServiceNowProvider: migrated (0.0060s) =============

I, [2020-08-17T08:52:18.970216 #11-47384871344480]  INFO -- : Migrating to Issue3123ExternalSyncTicketMerge (20200716124141)
== 20200716124141 Issue3123ExternalSyncTicketMerge: migrating =================
== 20200716124141 Issue3123ExternalSyncTicketMerge: migrated (0.0160s) ========
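
A log like the one above can be checked mechanically instead of by eye. The following is a minimal sketch, assuming the log has been saved to a file and uses the “== <version> <Name>: migrating / migrated” line format shown above; the function name check_migrations is made up for this example. It only tells whether every started migration also reported completion, not whether the resulting data is correct.

```shell
# Compare started vs. finished migrations in a migration log read from stdin.
# Exit status is 1 if any migration started but never reported "migrated".
check_migrations() {
  awk '
    $1 == "==" && $4 == "migrating" {
      name = $3; sub(/:$/, "", name)          # drop the trailing colon
      if (!($2 in started)) n_started++
      started[$2] = name
    }
    $1 == "==" && $4 == "migrated" {
      if (!($2 in finished)) n_finished++
      finished[$2] = 1
    }
    END {
      missing = 0
      for (v in started) {
        if (!(v in finished)) {
          printf "unfinished: %s %s\n", v, started[v]
          missing++
        }
      }
      printf "started=%d finished=%d\n", n_started, n_finished
      exit (missing > 0 ? 1 : 0)
    }
  '
}
```

Usage would be something like `check_migrations < migrate.log`, where migrate.log is a hypothetical file holding the captured output.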

While playing around a lot in the meantime (tried like 30 times to import the dbdump from our prod system), I’ve found the following:

- when I only import the db dump, run the migrations and start the app server again, the system becomes “unstable” in 100% of my tries - that is, you can’t assign tickets, can’t make any changes to users/roles, you can’t see your tickets anymore, etc… and it seems to get worse with every click you make.
- when I import the db dump, run the migrations AND execute “rails zammad:db:init” (see output below), everything seems to be ok again. I don’t know what exactly happens, but the db:init seems to somehow solve the problem … strange.

I’ve tried it many times and I can reproduce it in 100% of the cases - after running “rails zammad:db:init” the system is stable and without any issues.
When I don’t issue the “rails zammad:db:init” command, it starts quite soon to somehow “degenerate” and becomes unusable.

rails zammad:db:init
I, [2020-08-17T09:32:34.588384 #20-47035630770540]  INFO -- : Setting.set('models_searchable', ["Organization", "Ticket", "KnowledgeBase::Answer::Translation", "User", "Chat::Session"])
Database 'zammad_prod' already exists
I, [2020-08-17T09:32:36.265954 #20-47035630770540]  INFO -- : Enqueued SearchIndexJob (Job ID: 9f812cc7-7c88-45a3-b5e6-72606e261f08) to DelayedJob(default) with arguments: "User", 2
I, [2020-08-17T09:32:36.276299 #20-47035630770540]  INFO -- : Enqueued SearchIndexAssociationsJob (Job ID: beb88dfa-5e6d-4ac9-a63b-19125d05ce08) to DelayedJob(default) with arguments: "User", 2
I, [2020-08-17T09:33:47.525076 #20-47035630770540]  INFO -- : Enqueued SlaTicketRebuildEscalationJob (Job ID: 438e33ec-3e24-4196-8cea-8071aeeeaa6f) to DelayedJob(default)
I, [2020-08-17T09:33:47.571138 #20-47035630770540]  INFO -- : Won't enqueue SlaTicketRebuildEscalationJob (Job ID: 97df5146-e380-4a79-a84d-b5a2f4cad4a0) because of already existing job with lock key 'SlaTicketRebuildEscalationJob'.
I, [2020-08-17T09:33:47.573516 #20-47035630770540]  INFO -- : Enqueued SlaTicketRebuildEscalationJob (Job ID: 97df5146-e380-4a79-a84d-b5a2f4cad4a0) to DelayedJob(default)

To me this looks like there is maybe a missing migration or something?
I’ve also made a diff on the zammad 3.2 db schema vs. zammad 3.4 db schema but could not find any difference here.

For the moment the “rails zammad:db:init” seems to solve my problem - although I’m still uncertain about what exactly was causing the problem.
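
The order of steps that worked in these tests could be written down roughly as follows. This is a sketch only: “rails zammad:db:init” is the command quoted in this post, while “rails db:migrate” is an assumption, because the post does not show how the migrations were triggered. The DRY_RUN switch is likewise just part of the example, and by default the commands are only printed.

```shell
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print, 0 = really execute

# Print a command line; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    sh -c "$*"
  fi
}

# Steps between "dump imported" and "app server started again".
finish_upgrade() {
  # Assumption: the post does not show how the migrations were started.
  run "rails db:migrate"
  # Command quoted in this post; skipping it reproduced the problem.
  run "rails zammad:db:init"
}

finish_upgrade
```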

MrGeneration · August 24, 2020, 9:03am

I honestly don’t have a clue what’s going wrong in your installation.
A db:init shouldn’t be needed at all on a database you’re importing.

Maybe the initial dump was somehow degraded or broken.
Maybe references were missing - don’t know.

robert_muehlbauer · August 25, 2020, 8:22am

yes, I agree - it’s really strange.

The odd thing is that I can reproduce this strange behaviour at will. Because of a lot of testing and playing around, I created a lot of db dumps and imported the data into a new “out of the box” zammad installation … the result is the same every time, depending only on whether I run db:init or not.

MrGeneration · August 26, 2020, 8:34am

Did you test it with a “repaired” import?
Maybe migration-related data went missing over time and caused this issue?

If you export a working imported installation and restore that, do you still have to run db:init?

Only if you have the time though!

robert_muehlbauer · August 26, 2020, 12:10pm

hmm, that’s a good question - I only tried to get my upgrade scenario working, which was upgrading from 3.2 to 3.4 … I never tried what happens when I restore a dump that was already created on zammad 3.4 - I will give that a go. I’ll get back to you with the result once I’ve tried it.

robert_muehlbauer · August 27, 2020, 11:05am

Ok, I’ve tried the dump/import thing between two zammad 3.4 installations - it worked without any problems.
What I did:

- created a dump on the first database: pg_dump -d database-to-dump -U postgres | gzip > output.gz
- imported the dump file into the second database: zcat output.gz | psql -d database-to-import-to -U postgres

After the import was done and the zammad app server was up and running again (I stopped it during the db import), I was able to log in to the new system and make changes to users/roles/groups without breaking the system.

So it really seems that one tiny step is missing in the upgrade migration procedure from 3.2 to 3.4.
Furthermore, db:init seems to correct this issue …

MrGeneration · August 27, 2020, 11:55am

Thank you so much for having a second look into this!
I’m glad that this doesn’t seem to be a general issue but may appear because of missing information within Zammad’s database (for whatever reason).

robert_muehlbauer · September 1, 2020, 12:05pm

you are welcome. Regarding the general issue, it’s all about your point of view, I suppose - on the one hand zammad itself does not seem to cause any kind of data corruption, but on the other hand it looks like there is something missing in the db migrations that are run automatically.
For someone who knows all the details about db:init and migrations, it shouldn’t be too hard to identify the missing parts?

MrGeneration · September 7, 2020, 4:07pm

The difference between “init” and “migrate” is that “init” also seeds the database:

https://github.com/zammad/zammad/blob/fae194918eb8a07036060a1b41f1b33c45e03cad/lib/tasks/zammad/db/init.rake

namespace :zammad do

  namespace :db do

    desc 'Initializes (creates, migrates and seeds) the DB'
    task init: %i[zammad:db:unseeded db:seed]
  end
end

vs

https://github.com/zammad/zammad/blob/fae194918eb8a07036060a1b41f1b33c45e03cad/lib/tasks/zammad/db/unseeded.rake

namespace :zammad do

  namespace :db do

    desc 'Creates and migrates the DB without seeding'
    task unseeded: %i[db:create db:migrate]
  end
end

Usually during a restoration you don’t want to seed information into the database.
However: It may come in handy if you have stuff missing (for whatever reason!).

So it’s all about the seeding here.
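
To spell that out, the two task definitions quoted above can be expanded and compared. The sketch below simply transcribes the prerequisites from those two rake files into a shell function; expand_task and extra_steps are names invented for this illustration and are not part of Zammad.

```shell
# Prerequisites of the two rake tasks, transcribed from the files quoted above.
expand_task() {
  case "$1" in
    zammad:db:unseeded) echo "db:create db:migrate" ;;
    zammad:db:init)     echo "$(expand_task zammad:db:unseeded) db:seed" ;;
    *)                  echo "$1" ;;
  esac
}

# Print every step of task $1 that is not also a step of task $2.
extra_steps() {
  for step in $(expand_task "$1"); do
    case " $(expand_task "$2") " in
      *" $step "*) ;;            # step is shared by both tasks
      *) echo "$step" ;;
    esac
  done
}

extra_steps zammad:db:init zammad:db:unseeded   # prints: db:seed
```

The only step that “init” adds on top of “unseeded” is db:seed. Whether running db:seed on its own after the migrations would have been enough in the case discussed here was not tested in this thread.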
