
Thread: ZxBackup Operation Purge failed on some mailbox servers

  1. #11
    stsimb (Senior Member) · Join Date: Oct 2013 · Posts: 70
    With the RealTime Scanner turned off, the Purge operation completed successfully on zm-mbox-01 (after 39 hours!).

  2. #12
    Cine (ZeXtras Community Manager, ZeXtras Employee) · Join Date: Apr 2011 · Posts: 2,342
    Hello stsimb!

    I'm glad the purge completed. Its duration is heavily affected by the issue we discussed and by the high number of concurrent reads/writes, but ZeXtras Suite 1.8.18 should fix the cause of this issue and bring the duration of both the SmartScan and the Purge back down to a sane value.

    An update for all users reading this thread:
    We found an anomalous situation involving External Accounts, where Zimbra updated a folder's metadata once a minute, causing the related metadata stored by ZeXtras Backup to grow abnormally in size and making the RealTime Scanner and any other operation highly unstable and memory-hungry.

    A fix will be included in ZeXtras Suite 1.8.18.

    Have a nice day,
    Cine
    the ZeXtras Team

  3. #13
    stsimb (Senior Member) · Join Date: Oct 2013 · Posts: 70
    On zm-mbox-02 the Purge operation started almost a week ago. It has scanned over 20 million items, and since this afternoon it appears to have hit the hardcoded 80,000-operation queue limit in ZeXtras 1.8.18.
    Code:
    startTime    01/10/2014 00:11:06
    module       ZxBackup
    name         DoPurge

    Account scanned:     207/208
    Item scanned:        20422872
    Items per second:    49

    2014-10-06 15:22:00,944 INFO  [ZeXtras Real Time Notifier Thread] [] extensions - Unable to queue operation ItemScan from ZxBackup. Cause: Cannot queue more than 80000 operations, server is probably under heavy load.
    ...
    2014-10-06 19:55:53,393 INFO  [ZeXtras Real Time Notifier Thread] [] extensions - Unable to queue operation ItemScan from ZxBackup. Cause: Cannot queue more than 80000 operations, server is probably under heavy load.
    Any advice, please?

  4. #14
    stsimb (Senior Member) · Join Date: Oct 2013 · Posts: 70
    The Purge operation on zm-mbox-02 never completed.

    There were too many small files on the filesystem (>20 million), and the running Purge had too much user impact: after the I/O scan phase, the CPU was constantly at 100%, which triggered too many GC runs and caused long freezes in mailboxd.

    We moved the old dir out of the way and re-initialized ZxBackup on this mailbox server (running ZeXtras 1.10.0 now).
    Even deleting the old dir took ~60 hours, with multiple parallel rm -rf's running in the background in batches of 50 dirs.
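
    For the curious, the batched delete looked roughly like this (a sketch; the path and parallelism here are illustrative, not our exact commands):
    Code:
    # Feed the old backup dir's account subdirs to xargs: each rm -rf gets
    # a batch of 50 dirs, and up to 4 batches run in parallel (-P 4).
    # Path and -P value are examples only.
    ls -d /mail-backup/old-zm-mbox-02/accounts/*/ | xargs -n 50 -P 4 rm -rf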

    This mailbox server contained accounts with many small log files, similar to the ones reported on http://forums.zextras.com/zxbackup/1....html#post5260
    Last edited by stsimb; 11-10-2014 at 09:51 AM.

  5. #15
    Cine (ZeXtras Community Manager, ZeXtras Employee) · Join Date: Apr 2011 · Posts: 2,342
    Hello stsimb!

    I'd like some more info about your storage system:

    - What hardware is it based on?
    - Do you use a RAID array? LVM?
    - Is the storage shared between all of your mailbox servers? If so, is it dedicated to ZeXtras Backup?
    - Are you using the same filesystem on all mailbox servers?
    - What retention is set up in ZeXtras Backup?

    Have a nice day,
    Cine
    the ZeXtras Team

  6. #16
    stsimb (Senior Member) · Join Date: Oct 2013 · Posts: 70
    The storage is an IBM v7000 SAN, connected via FC to a physical HP DL360 server.
    A 1TB LUN on RAID5 SATA disks is dedicated to ZeXtras Backup and shared via NFS to the mailbox servers.
    Same filesystem for all (ext4).
    The mailbox servers are VMs on ESXi 5.1 hosts.
    Retention is set to 30 days for all mailbox servers.
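
    For reference, the backup mount on each mailbox server looks something like this (illustrative; the real NFS host name and mount options differ):
    Code:
    # /etc/fstab entry on a mailbox server (sketch; host and options are examples)
    nfs-server:/mail-backup  /mail-backup  nfs  rw,hard,rsize=32768,wsize=32768  0 0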

    But let me state some other facts as well.

    The "old" directory of zm-mbox-02 was last successfully purged on 2/7/2014 (or rather Full Scanned, since we were on a ZeXtras version before 1.8.17 back then).
    That successful Full Scan reported 5.3 million items checked and took 4 days and 3 hours.

    After that, for various reasons, there was no successful Full Scan (or Purge) of this dir, and it kept growing, unpurged, for 4 months.
    The last unsuccessful purge attempt was last week; it took more than 7 days and reported 22 million items traversed.
    Reasons for failing over the 4-month period included: taking too long, invalid characters reported, overlapping with SmartScans and consuming too much CPU time, users reporting Zimbra as painfully slow, etc.

    So we ditched the old dir and initialized again.
    The SmartScan that successfully finished a few hours ago, in the new (empty) subdir on the same storage, ran for 3 days, 5 hours, 4 minutes, 40 seconds.
    Code:
                       new accounts: 177
                   accounts updated: 0
           skipped accounts(by COS): 0
                       item updated: 0
                       new metadata: 5012030
                          new files: 3220510
                      checked items: 5138459
                        backup path: /mail-backup/zm-mbox-02.cloud.forthnet.prv/
                      skipped items: 0
                 I/O read exception: 0
                  num skipped files: 0
                          items/sec: 18
    We now plan to keep it as small as possible, running the Purge weekly (I don't know if there is enough time to run it daily or every other day)...
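
    If we end up scheduling it, it would look something like this (a sketch; the zxsuite doPurge command and path are assumed from the ZeXtras documentation, not tested here):
    Code:
    # /etc/cron.d/zxbackup-purge (illustrative sketch)
    # Run the backup Purge every Sunday at 01:00 as the zimbra user.
    0 1 * * 0  zimbra  /opt/zimbra/bin/zxsuite backup doPurge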

  7. #17
    Cine (ZeXtras Community Manager, ZeXtras Employee) · Join Date: Apr 2011 · Posts: 2,342
    Hello stsimb!

    Thank you for summarizing all the related info: I've had a word about this with D0s_0n (who is far more expert than me in this kind of tuning), and his suggestion is to export the NFS at the VMware level instead of directly to the servers, which would allow creating a separate vDisk for each mailbox server, improving inode management and offering more analysis/fine-tuning options.

    The bottleneck appears to be the single DL360: could you please confirm that all of your servers rely on it for their storage needs? In that case, a direct FC connection to the ESXi hosts or an iSCSI connection could provide much better performance, but I'm getting pretty far from my scope as ZeXtras IT Support, so I'll stop here...

    Please feel free to get in touch anytime if you notice any further performance issues!

    Have a nice day,
    Cine
    the ZeXtras Team

  8. #18
    stsimb (Senior Member) · Join Date: Oct 2013 · Posts: 70
    Quote Originally Posted by Cine
    Thank you for summarizing all the related info: I've had a word about this with D0s_0n (who is far more expert than me in this kind of tuning), and his suggestion is to export the NFS at the VMware level instead of directly to the servers, which would allow creating a separate vDisk for each mailbox server, improving inode management and offering more analysis/fine-tuning options.

    The bottleneck appears to be the single DL360: could you please confirm that all of your servers rely on it for their storage needs? In that case, a direct FC connection to the ESXi hosts or an iSCSI connection could provide much better performance, but I'm getting pretty far from my scope as ZeXtras IT Support, so I'll stop here...
    All our mailbox servers use the NFS server only for the ZxBackup filesystem.
    Main storage and the ZxPowerstore module move the blobs to a different, separate filesystem, directly attached to each server.

    ZxBackup writes to a single filesystem over NFS "by choice": we didn't know how much it would grow over time.
    This setup will probably change in the coming months, because now we've seen how it works, we know how fast it is, how well it scales, etc.

    Thanks for all the suggestions Cine & D0s_0n!
