Re: Slow performance on file system backup even to disk

Hi,

"The system doesn't seem to be under load (cpu / memory) so I can't see why sometimes this backup can take over 24 hours to complete."

Well, neither RAM nor CPU utilization is in any way indicative of I/O saturation. You are missing the most crucial performance metric for the case at hand: the disk duty cycle of the source drive. This metric has been ignored for a long time by most OSes and their monitoring toolkits, and only recently have more admins become aware of it (I see it repeatedly when asked for advice on slowness: people look for the bottleneck everywhere except the source, which, when you think about it, is perplexing - yet I was just as ignorant of it myself a decade ago). You don't specify which OS the source disk is attached to; giving you hints on measuring disk saturation would require that. The most typical tools I use are atop on Linux (it has a great per-blockdevice load display that turns colored once you reach certain thresholds, which is often an admin eye-opener), perfmon on Windows up to 2003 (% Disk Time for the physical drives is even one of the three default graphs) or Resource Monitor on newer Windows versions (the blue bar on the Disk tab).
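
If the source host happens to be Linux and you prefer numbers over graphs, the duty cycle can also be sampled straight from /proc/diskstats. The following is only a minimal sketch under the assumption of a Linux source and a device name like sda (both assumptions, not taken from your post); it prints how busy the device was during each one-second interval:

    #!/usr/bin/env python3
    # Rough disk duty-cycle sampler via /proc/diskstats (Linux only).
    # "sda" below is an assumed device name - adjust to the actual source disk.
    import sys, time

    def io_ticks_ms(device):
        # 10th stat field after the device name = cumulative milliseconds the
        # device spent with at least one I/O in flight (io_ticks)
        with open("/proc/diskstats") as f:
            for line in f:
                parts = line.split()
                if parts[2] == device:
                    return int(parts[12])
        raise SystemExit("device %s not found in /proc/diskstats" % device)

    device = sys.argv[1] if len(sys.argv) > 1 else "sda"   # assumed default
    prev = io_ticks_ms(device)
    while True:
        time.sleep(1)
        cur = io_ticks_ms(device)
        # (cur - prev) ms busy out of a 1000 ms interval -> percent busy
        print("%s: %5.1f%% busy" % (device, (cur - prev) / 10.0))
        prev = cur

If that number sits near 100% for the whole backup window while the CPU idles, the source spindle itself is the bottleneck - exactly what atop's colored per-device line or the perfmon/Resource Monitor counters above would show you.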

All that being said, your 0.5 million files in 73GB work out to an average file size of roughly 153KB, which isn't exactly the worst-case lots-of-small-files situation. It's still bad, and still causes lots of random I/O (exactly how bad depends on the OS and the file system block/cluster size in use), but IMO it's not bad enough to justify 570KB/s average throughput (or maybe it is, but only if the source is a single spindle of some slowly rotating nearline SATA disk that saturates at 90 IOPS like a WD Red - they are great PVR disks, but clearly not made for this use case). So there may be something else going wrong here. First of all, you should measure the throughput the source can actually achieve when reading all the files in a depth-first traversal. Bob already mentioned a backup to a local null device as one way to test that, and Daniel describes an even quicker method by calling the VBDA directly. There are others as well. Once you know the source performance, you know the best the backup could possibly achieve - other factors will only make it slower, never faster than the source. If it turns out your source reads this heap of files blazingly faster than what you actually see in a backup, then things get really interesting.
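
In case neither a null-device backup nor a direct VBDA run is convenient, a small script that simply walks the tree and reads every file gives roughly the same ceiling figure. This is only an illustrative sketch (the default path "." is an assumption, not from your setup), and it should run against a cold cache, otherwise you measure RAM instead of the disk:

    #!/usr/bin/env python3
    # Read every file in a directory tree depth-first and report the raw
    # throughput the source delivers - an upper bound for any backup of it.
    import os, sys, time

    root = sys.argv[1] if len(sys.argv) > 1 else "."   # assumed example path
    total_bytes, total_files = 0, 0
    start = time.monotonic()
    for dirpath, _dirs, files in os.walk(root):        # depth-first, top-down
        for name in files:
            try:
                with open(os.path.join(dirpath, name), "rb") as f:
                    while True:
                        chunk = f.read(1024 * 1024)
                        if not chunk:
                            break
                        total_bytes += len(chunk)      # discard data, count bytes
            except OSError:
                continue                               # unreadable file: skip it
            total_files += 1
    elapsed = max(time.monotonic() - start, 1e-6)
    print("%d files, %.0f MiB in %.0fs -> %.1f MiB/s"
          % (total_files, total_bytes / 2**20, elapsed, total_bytes / 2**20 / elapsed))

Compare the MiB/s this prints with the roughly 570KB/s the backup achieves: if the gap is huge, the source is fine and the problem sits somewhere between the disk agent and the device; if the script is just as slow, the discussion is over - the source simply cannot feed the backup any faster.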

HTH,

Andre.

