File System Fragmentation - Cause

Cause

When a file system is first initialized on a partition (the partition is formatted for the file system), the partition contains only a few small internal structures and is otherwise one contiguous block of empty space. This means that the allocator algorithm is completely free to place newly created files anywhere on the partition. For some time after the file system is created, files can be laid out near-optimally. When the operating system and applications are installed or other archives are unpacked, laying out separate files sequentially also means that related files are likely to be positioned close to each other.

However, as existing files are deleted or truncated, new regions of free space are created. When existing files are appended to, it is often impossible to resume the write exactly where the file used to end, as another file may already be allocated there, so a new fragment has to be allocated. As time goes on and the same factors remain present, both free space and frequently appended files tend to fragment more. Shorter regions of free space also mean that the allocator is no longer able to allocate new files contiguously and has to break them into fragments. This is especially true when the file system is nearly full, because long contiguous regions of free space are then less likely to occur.

Note that the following is a simplification of an otherwise complicated subject. The method described here has been the general practice for allocating files on disk and other random-access storage for over 30 years. Some operating systems do not simply allocate files one after the other, and some use various methods to try to prevent fragmentation. In general, however, fragmentation eventually occurs on any system where files are routinely deleted or expanded, for the reasons explained below. Consider the following scenario, as shown by the image on the right:

A new disk has had 5 files saved on it, named A, B, C, D and E, and each file uses 10 blocks of space (the block size is unimportant here). As the free space is contiguous, the files are located one after the other (Example (1)).

If file B is deleted, a second region of 10 blocks of free space is created, and the disk becomes fragmented. The file system could defragment the disk immediately after a deletion, but that would incur a severe performance penalty at unpredictable times. In general the empty space is simply left there, marked in a table as available for later use, and used again as needed (Example (2)).

Now if a new file F requires 7 blocks of space, it can be placed into the first 7 blocks of the space formerly holding file B, and the 3 blocks following it remain available (Example (3)). If another new file G is added and needs only 3 blocks, it can then occupy the space after F and before C (Example (4)).
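Examples (1) to (4) can be reproduced with a small simulation. The following is only a minimal sketch under stated assumptions: the 70-block disk size, the first-fit placement policy and all function names are illustrative choices that do not come from the text, and real file systems use more elaborate allocators.

```python
FREE = None  # marker for an unallocated block


def free_regions(disk):
    """Return (start, length) for each maximal run of free blocks."""
    regions, start = [], None
    for i, owner in enumerate(disk):
        if owner is FREE and start is None:
            start = i                      # a free run begins here
        elif owner is not FREE and start is not None:
            regions.append((start, i - start))
            start = None                   # the free run has ended
    if start is not None:                  # free run reaching the end of the disk
        regions.append((start, len(disk) - start))
    return regions


def allocate_contiguous(disk, name, size):
    """First-fit (an assumed policy): use the first free region that is large enough."""
    for start, length in free_regions(disk):
        if length >= size:
            disk[start:start + size] = [name] * size
            return start
    raise RuntimeError(f"no contiguous region of {size} blocks for {name}")


def delete(disk, name):
    """Mark every block owned by `name` as free; nothing is moved."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = FREE


disk = [FREE] * 70                         # hypothetical 70-block disk
for name in "ABCDE":                       # Example (1): five 10-block files
    allocate_contiguous(disk, name, 10)
print(free_regions(disk))                  # [(50, 20)]

delete(disk, "B")                          # Example (2): a 10-block hole appears
print(free_regions(disk))                  # [(10, 10), (50, 20)]

allocate_contiguous(disk, "F", 7)          # Example (3): F fills part of the hole
print(free_regions(disk))                  # [(17, 3), (50, 20)]

allocate_contiguous(disk, "G", 3)          # Example (4): G fills the rest
print(free_regions(disk))                  # [(50, 20)]
```

After the last step the free space is contiguous again, but file F is now boxed in between A and G, which is what makes any later growth of F a problem.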

If F subsequently needs to be expanded, the space immediately following it is already occupied, which leaves three options: (1) allocate additional blocks somewhere else and record that F has a second extent; (2) move the files that are in the way of the expansion elsewhere, so that F can remain contiguous; or (3) move file F to a location where it can be stored as one contiguous file of the new, larger size. The second option is probably impractical for performance reasons, as is the third when the file is very large. Indeed, the third option is impossible when there is no single contiguous free region large enough to hold the new file. Thus the usual practice is simply to create an extent somewhere else and chain the new extent onto the old one (Example (5)).

Material later added to the end of file F would become part of that same new extent, as long as free space follows it. But if there is so much material that no room is available after the last extent, then another extent has to be created, and so on. Eventually the file system has free segments in many places, and some files may be spread over many extents. Access time for those files (or for all files) may become excessively long.
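The extent chaining of option (1) can be sketched in the same way. This is an illustrative model only, not how any particular file system is implemented: the 70-block disk, the extent table `extents`, the helper `extend` and the extra file H are all hypothetical. The disk starts in the state of Example (4).

```python
FREE = None  # marker for an unallocated block


def free_regions(disk):
    """Return (start, length) for each maximal run of free blocks."""
    regions, start = [], None
    for i, owner in enumerate(disk):
        if owner is FREE and start is None:
            start = i
        elif owner is not FREE and start is not None:
            regions.append((start, i - start))
            start = None
    if start is not None:
        regions.append((start, len(disk) - start))
    return regions


def extend(disk, extents, name, extra):
    """Append `extra` blocks to file `name`, chaining new extents when needed."""
    if extra > disk.count(FREE):
        raise RuntimeError("not enough free space")
    # First grow the last extent in place while the block after it is free.
    start, length = extents[name][-1]
    while extra and start + length < len(disk) and disk[start + length] is FREE:
        disk[start + length] = name
        length += 1
        extra -= 1
    extents[name][-1] = (start, length)
    # Whatever remains goes into free regions, one new extent per region used.
    for r_start, r_len in free_regions(disk):
        if extra == 0:
            break
        take = min(extra, r_len)
        disk[r_start:r_start + take] = [name] * take
        extents[name].append((r_start, take))
        extra -= take


# State of Example (4): A, F, G, C, D, E followed by 20 free blocks.
disk = (["A"] * 10 + ["F"] * 7 + ["G"] * 3 + ["C"] * 10
        + ["D"] * 10 + ["E"] * 10 + [FREE] * 20)
extents = {"F": [(10, 7)]}                 # F is still one contiguous extent

extend(disk, extents, "F", 5)              # Example (5): G blocks in-place growth
print(extents["F"])                        # [(10, 7), (50, 5)]

extend(disk, extents, "F", 4)              # room after the last extent: it grows
print(extents["F"])                        # [(10, 7), (50, 9)]

disk[59:64] = ["H"] * 5                    # a hypothetical file H lands right after F
extend(disk, extents, "F", 3)              # no room after the last extent any more
print(extents["F"])                        # [(10, 7), (50, 9), (64, 3)]
```

Reading F from start to end now means visiting three separate places on the disk, which is the source of the longer access times described above.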

To summarize, factors that typically cause or facilitate fragmentation include:

  • low free space.
  • frequent deletion, truncation or extension of files.
  • overuse of sparse files.
