Unbearably slow import times

Thread: Unbearably slow import times

  1. #1
    Active Member
    Join Date
    Aug 2014
    Posts
    5

    Hello guys,

    I am trialling the ZeXtras bundle and am having a great deal of trouble with the time it takes to import data from an exported backup. As an example, we have a test domain with 11 users containing just under 500k messages, with a total exported size of around 25GB. It took around 2 hours to export the data.

    I started importing the data overnight on a virtual machine with the same specs as the original Zimbra server that was used for the export: 8 vCPUs and 8GB of RAM. After about 8 hours of importing, it has completed only one account. Looking at the server, there is practically no disk activity; disk utilisation is 0% most of the time. I do see mysqld using somewhere between 60 and 80% of a single CPU. Overall, the server's load averages 1.1 and iowait is around 0-5%.

    I did notice that /opt/zimbra/log/myslow.log is rather large - over 2GB in size.
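
    To see whether the slow query log is dominated by a few long statements or by many short ones, a quick summary helps. This is a minimal sketch: the function name `slowlog_summary` is made up for illustration, and it assumes the log uses the standard MySQL slow-log header line `# Query_time: <seconds> ...` for each entry.

```shell
# Summarise a MySQL slow query log: number of logged queries and their total time.
# Assumption: every entry has a header line starting with "# Query_time: <seconds>".
slowlog_summary() {
    awk '/^# Query_time:/ { n++; total += $3 }
         END { printf "%d queries, %.1f s total\n", n, total }' "$1"
}

# Usage (as the zimbra user): slowlog_summary /opt/zimbra/log/myslow.log
```

    A very high query count with a small total time would point to many cheap statements rather than a few expensive ones.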

    As a comparison, we ran a recovery, on the original server, of one of the accounts that is currently being imported. The recovery was done from the Zimbra NE backup, and it took just over an hour to recover the same mailbox, which was around 9.5GB and just over 100k messages.

    How do I speed up the import process? I have a number of domains that I was planning to move across, containing over 1TB of data and over 200 users. By my estimate, at this pace the migration would take about half a year, which is not feasible.

    The server runs Ubuntu 12.04 with the latest updates, Zimbra Open Source 8.0.7 and the latest ZeXtras bundle in trial mode.

    Many thanks

    Andrei

  2. #2
    ZeXtras Community Manager ZeXtras Employee Cine's Avatar
    Join Date
    Apr 2011
    Posts
    2,356
    Hello Andrei,
    welcome to the forums!

    First of all, I'd like you to check if the import is still running, as it's possible that a mailboxd restart interrupted the operation: in order to do so, you just need to run the following command as the 'zimbra' user:

    Code:
    zxsuite backup getAllOperations
    This will show the currently running operation and the operation queue. If the operation is not running, please check your mailbox.log for errors and make sure that the mailboxd service was not restarted - either manually or automatically.
    This command will also display the Operation ID for the import (which you can also find in the "Operation Started" notification), which you can use as the argument for the "monitor" command that will output the progress of an operation:

    Code:
    zxsuite backup monitor operation_ID
    Secondly, while CPU and RAM are important for ZeXtras Suite, in your case the main bottleneck is caused by I/O throughput: can you please tell me what's your setup?

    Have a nice day,
    Cine
    the ZeXtras Team

  3. #3
    ZeXtras Community Manager ZeXtras Employee Cine's Avatar
    Join Date
    Apr 2011
    Posts
    2,356
    Hello again

    I also suggest making sure that both your Zimbra JVM options and MySQL optimizations are properly set up...

    Have a nice day,
    Cine
    the ZeXtras Team

  4. #4
    Active Member
    Join Date
    Aug 2014
    Posts
    5
    Hi Cine,

    The import is still running. I am keeping an eye on the progress. From what I can see, the output of the "zxsuite backup monitor <id>" command refreshes every 5 seconds, and the Restored/Skipped item counts change by around 50 messages with every refresh. Occasionally they don't change, but in general it's about 50 messages per refresh.
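
    For a rough sense of scale, the observed rate can be turned into an estimate. This is a sketch only: `eta_hours` is a made-up helper, and it assumes the rate stays constant and that every counted item is a message.

```shell
# Estimate hours needed to process TOTAL items at PER_REFRESH items every REFRESH_SECS seconds.
# Assumption: the rate stays constant for the whole run.
eta_hours() {
    awk -v total="$1" -v per="$2" -v secs="$3" \
        'BEGIN { printf "%.1f\n", total / (per / secs) / 3600 }'
}

# 500k messages at about 50 messages per 5-second refresh:
eta_hours 500000 50 5   # prints 13.9
```

    That works out to roughly 10 messages per second for this test domain.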


    The output of the "zxsuite backup getAllOperations" command shows the information on the backup.


    To answer your question ("Secondly, while CPU and RAM are important for ZeXtras Suite, in your case the main bottleneck is caused by I/O throughput: can you please tell me what's your setup?"): the Zimbra server is running as a virtual machine on a KVM hypervisor with an RBD storage backend. According to benchmarks, the storage cluster itself can handle 100+MB/s in sequential reads/writes and about 3,000-4,000 random 4K IOPS. I struggle to believe that the storage is causing the issue, since the export was run against the same storage cluster and, as I mentioned earlier, took around 2 hours for the entire domain.

    Also, if I/O throughput were the issue, I would have seen high iowait figures on the server and high disk utilisation. I am seeing practically zero utilisation and close to zero iowait. To me, this does not look like an I/O throughput issue.
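
    One way to put a number on iowait over a short interval, without extra tools, is to sample the aggregate `cpu` line of `/proc/stat` twice. This is a minimal sketch: the helper names `cpu_sample` and `iowait_between` are made up, and it assumes the usual Linux field order (user, nice, system, idle, iowait, irq, softirq, steal).

```shell
# Print "<total jiffies> <iowait jiffies>" from the first (aggregate) line of a /proc/stat-style file.
cpu_sample() {
    read -r label user nice system idle iowait irq softirq steal rest < "${1:-/proc/stat}"
    echo "$(( user + nice + system + idle + iowait + irq + softirq + steal )) $iowait"
}

# Percentage of CPU time spent in iowait between two samples taken with cpu_sample.
iowait_between() {
    set -- $1 $2   # split the two "total iowait" pairs into $1..$4
    awk -v dt="$(( $3 - $1 ))" -v dio="$(( $4 - $2 ))" \
        'BEGIN { if (dt > 0) printf "%.1f\n", 100 * dio / dt; else print "0.0" }'
}

# Usage: a=$(cpu_sample); sleep 5; b=$(cpu_sample); iowait_between "$a" "$b"
```

    A value that stays near zero while the import is running would be consistent with the observation that the disks are mostly idle.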

    Thanks

    Andrei

  5. #5
    CTO ZeXtras Employee d0s0n's Avatar
    Join Date
    Apr 2011
    Posts
    570

    Hi Andrei,

    This seems a bit odd, so to better understand your environment I suggest a simple profiling method using the zmthrdump command (as the zimbra user):

    Code:
    for i in {1..60}; do echo -n .; zmthrdump > /tmp/thread_dump_$i; sleep 1.0; done;
    tar cvzf /tmp/thread_dump.tgz /tmp/thread_dump_*
    If you can send us (community[at]zextras.com) the resulting archive, we can try to find your bottleneck (this data shouldn't contain any confidential information).
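
    For a first look at the dumps before sending them, the thread states can be tallied across all files. This is a sketch under an assumption: that the zmthrdump output contains HotSpot-style `java.lang.Thread.State: <STATE>` lines. The function name `thread_state_counts` is made up for illustration.

```shell
# Count thread states across all given dump files, most frequent first.
# Assumption: each dump contains lines like "   java.lang.Thread.State: RUNNABLE".
thread_state_counts() {
    grep -h 'java.lang.Thread.State:' "$@" \
        | awk '{ print $2 }' \
        | sort | uniq -c | sort -rn
}

# Usage: thread_state_counts /tmp/thread_dump_*
```

    Because the dumps are taken once per second, a state or stack that shows up in most of the 60 samples is a reasonable candidate for where the time is going.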

    D0s0n
    Head of ZeXtras System Administrators

  6. #6
    Active Member
    Join Date
    Aug 2014
    Posts
    5
    I've just checked the java and sql tunings as per your link and it seems that the only value which was not set according to the guide is the "innodb_max_dirty_pages_pct", which was set to 30. I've updated the value to 10 using: set @@global.innodb_max_dirty_pages_pct = 10;

    However, going back to the running import - the change did not speed up the process. Still getting about 50 messages with every refresh.

    Please advise.

    Andrei

    P.S. I will do the commands and send over the archive shortly.

  7. #7
    Active Member
    Join Date
    Aug 2014
    Posts
    5
    Sent the files as per your request.

  8. #8
    ZeXtras Community Manager ZeXtras Employee Cine's Avatar
    Join Date
    Apr 2011
    Posts
    2,356
    Hello Andrei!

    We inspected the files you sent, and it looks like the cause of the slowdown is the "Apply HSM Policy during restore" option you enabled. This can have a high impact on the duration of the import, since items are moved in batches, so I'd suggest you proceed as follows:

    - Using ZeXtras Powerstore, create a new "Primary" volume
    - Enable compression on the new volume and mark it as current
    - Run the import without selecting the "Apply HSM" options.
    - Once the import is completed, set the original volume as "current"
    - Use the ZeXtras Powerstore CLI (run "zxsuite powerstore" for a command list) to change the new volume - which now contains the majority of your data - from "primary" to "secondary".
    - For good measure, deduplicate the contents of the volume using the "zxsuite powerstore doDeduplicate" command.

    This way, your HSM target will already be holding most of your items - compressed and deduplicated - so that you won't need to move anything.


    Have a nice day,
    Cine
    the ZeXtras Team

  9. #9
    Active Member
    Join Date
    Aug 2014
    Posts
    5
    Thanks for your suggestion.

    Could you please let me know if I can run the import without the HSM and enable HSM after the import. Would this migrate emails to HSM over time after the import?

    Thanks

  10. #10
    ZeXtras Community Manager ZeXtras Employee Cine's Avatar
    Join Date
    Apr 2011
    Posts
    2,356
    Quote Originally Posted by mozg View Post
    Thanks for your suggestion.

    Could you please let me know if I can run the import without the HSM and enable HSM after the import. Would this migrate emails to HSM over time after the import?

    Thanks
    Sure, that will work too, but keep in mind that the first HSM move will be quite large, since many items will be moved, so make sure to schedule it for a low-load period...

    Have a nice day,
    Cine
    the ZeXtras Team
