Unfortunately, there is still a widely held belief that file fragmentation is a thing of the past and that defragmentation is no longer necessary. Some IT professionals even mention having read somewhere that “NTFS does not need defragmentation” or that it somehow prevents fragmentation. The truth is that nothing fundamental has changed: file fragmentation is a consequence of having to organize files in linear storage. It arises from the “physics” of storing data on disk drives, and (almost) nothing can avoid it.
Each file typically uses multiple sectors on the disk. When files are placed side by side and one of them later needs to grow, the result is fragmentation. The file cannot be extended in place unless the file next to it is moved away; otherwise it has to be split and continued elsewhere, which creates a fragmented file. No magic on earth can change that fact. There are, of course, allocation algorithms that make this less likely, but they cannot prevent it with certainty.
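The effect can be reproduced with a toy model. The following is only a minimal sketch, assuming a "disk" of 12 clusters and a naive first-free-cluster allocator; the function names are made up for illustration, and real file systems such as NTFS use far more elaborate allocation strategies.

```python
def allocate(disk, name, count):
    """Write `count` clusters of file `name` into the first free clusters found."""
    written = 0
    for i, owner in enumerate(disk):
        if written == count:
            break
        if owner is None:
            disk[i] = name
            written += 1
    if written < count:
        raise RuntimeError("disk full")


def fragments(disk, name):
    """Count the contiguous runs of clusters that belong to file `name`."""
    runs = 0
    previous = None
    for owner in disk:
        if owner == name and previous != name:
            runs += 1
        previous = owner
    return runs


disk = [None] * 12          # a tiny hypothetical disk of 12 clusters
allocate(disk, "A", 4)      # file A occupies clusters 0-3
allocate(disk, "B", 4)      # file B is placed right behind it, clusters 4-7
allocate(disk, "A", 2)      # A grows, but the space directly behind it is taken

print("".join(owner or "." for owner in disk))   # AAAABBBBAA..
print("fragments of A:", fragments(disk, "A"))   # fragments of A: 2
```

File A ends up in two pieces even though a third of the disk is still free, which is the situation described above.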
Another place where fragmentation is evident and causes a performance degradation is the NTFS Master File Table. Small files get written directly into it and can fill it up quickly. This leads to the file system having to allocate more space elsewhere, i.e. a fragment, to extend the MFT.
As time goes by, files and folders get fragmented, even on a disk or disk array that is mostly empty. Backups are usually sent to cheaper, higher-capacity mechanical drives. Each time such a drive encounters a fragmented file, the heads have to move to the position of the next fragment, and this incurs a seek time, usually in the range of a few milliseconds. In a heavily fragmented file system, these milliseconds quickly add up to seconds, seconds to minutes, and finally to hours of unnecessary processing.
In short, if you want faster backups, you must defragment your backup drives as well as the drives that hold your source data. Some intelligent defragmentation software also offers MFT defragmentation, as well as fragmentation avoidance by leaving some free space behind files that are likely to grow. Dynamically expanding virtual machine disk files are one of the categories of files most likely to show excessive fragmentation, since virtual disks constantly expand over time.
But it’s not only virtual disk files and virtual disk backups that are affected. Typical file server data is also not immune to fragmentation.
Consider the defrag output for this disk array, which has held a file server backup for about two years now:

Pre-Optimization Report:

    Volume Information:
        Volume size = 10.91 TB
        Cluster size = 4 KB
        Used space = 2.28 TB
        Free space = 8.63 TB

    Fragmentation:
        Total fragmented space = 79%
        Average fragments per file = 2.47
        Movable files and folders = 2311698
        Unmovable files and folders = 4

    Files:
        Fragmented files = 1133584
        Total file fragments = 3381277

    Folders:
        Total folders = 24824
        Fragmented folders = 6039
        Total folder fragments = 41735

    Free space:
        Free space count = 409309
        Average free space size = 22.10 MB
        Largest free space size = 379.21 GB

    Master File Table (MFT):
        MFT size = 2.37 GB
        MFT record count = 2495231
        MFT usage = 100%
        Total MFT fragments = 45
This file server backup array shows massive levels of disk fragmentation after less than two years of storing nightly file server data, which consists only of documents, not virtual disk data.
Note that the MFT is full and split into 45 fragments! The largest contiguous free space is just 379 GB, even though the 10.91 TB disk array still has over 8.6 TB free in total.
The number of fragmented files and folders is also very high.
When a backup is run, the source and destination need to be scanned and compared. With fragmented drives, there will be millions of unnecessary head movements. The accumulated seek time therefore causes backups to run hours longer than necessary.
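A rough back-of-the-envelope estimate shows the scale involved. This is only a sketch under stated assumptions: the fragment counts are taken from the report above, the average seek time of 8 ms is an assumed figure for a mechanical drive (not a measurement from this array), and the calculation assumes one full pass over every fragmented file, with the reported fragment total covering only those files.

```python
def extra_seek_hours(total_fragments, fragmented_files, seek_ms):
    """Estimate the time spent on extra head movements, in hours."""
    # A file split into n fragments needs n - 1 extra head movements.
    extra_seeks = total_fragments - fragmented_files
    return extra_seeks * seek_ms / 1000 / 3600


TOTAL_FILE_FRAGMENTS = 3_381_277   # "Total file fragments" from the report
FRAGMENTED_FILES = 1_133_584       # "Fragmented files" from the report
ASSUMED_SEEK_MS = 8.0              # assumption: average seek time in ms

hours = extra_seek_hours(TOTAL_FILE_FRAGMENTS, FRAGMENTED_FILES, ASSUMED_SEEK_MS)
print(f"{hours:.1f} hours of additional seek time")   # 5.0 hours of additional seek time
```

Under these assumptions, roughly 2.2 million extra seeks account for about five hours of pure head movement. The real figure depends on the drive, on caching, and on how much of the data a given backup run actually touches.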
Do I need to defragment SSD drives?
Yes and no. Intelligent disk defragmentation software exists that can reduce the wear on the flash cells inside SSD drives. In addition, the speeds quoted by SSD manufacturers refer to consecutive reading and writing. When blocks are dispersed randomly, SSD performance suffers too, although the effect is not as dramatic as with mechanical drives. Wear and tear, on the other hand, is a major cause of premature SSD failure. So even if your SSD performs acceptably despite being fragmented, the fragmentation may contribute to an earlier failure.
How to Speed up Backups
Run the defragmentation tool of your choice at regular intervals, especially for file server, database, and virtual machine backups. If you are dealing with fixed-size virtual disks and databases, the defragmentation only needs to be done once on the host. For file server data, however, it’s best to defragment often. Do not forget to defragment your backup drives as well. To speed up file server backups, our backup software offers a feature to process multiple files simultaneously. This is especially useful when dealing with many small files, such as document files, because there is a certain amount of overhead when processing individual files.
Most of the time it’s best, however, not to run backups in parallel. Hard drives and system look-ahead caching are optimized to offer the best throughput when reading consecutive sectors. Even on most SSD drives, performance is better when data is stored in a single block rather than split into thousands of fragments. As a bonus, some defragmentation tools offer fragmentation avoidance, which not only improves performance but also reduces the wear and tear on your hardware.
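The general idea of processing several small files at once can be sketched as follows. This is not the implementation used by any particular backup product; it is a minimal illustration using Python's standard thread pool, and the function name copy_tree_parallel and its workers parameter are hypothetical.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_tree_parallel(source, destination, workers=4):
    """Copy every file under `source` to `destination` using a thread pool."""
    source, destination = Path(source), Path(destination)
    files = [p for p in source.rglob("*") if p.is_file()]

    def copy_one(path):
        # Recreate the relative folder structure below the destination.
        target = destination / path.relative_to(source)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)   # copies file data and timestamps
        return target

    # Several files are in flight at once, so the per-file overhead
    # (open, create, close) of one file overlaps with the I/O of another.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(copy_one, files))
```

A call such as copy_tree_parallel("D:/Documents", "E:/Backup", workers=4) would copy the tree with four files in flight at a time; both paths are placeholders. The gain comes from overlapping per-file overhead, so it mainly helps with many small files.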