FAT Partition Efficiency: Slack
One issue related to the FAT file system that has gained a lot more attention over the years is slack, the colloquial term for space wasted because files are stored in whole clusters. The attention began in the mid-1990s, when larger and larger hard disks began shipping with most systems. Typically, the disks in retail systems were not divided into multiple partitions, and users began noticing that large quantities of their hard disk space seemed to "disappear". In many cases this amounted to hundreds of megabytes on a disk of only 1 to 2 GB in size. When FAT32 became more common, the problem was less of an issue for a while. Today, with hard disks of 40 GB or more commonplace, even FAT32 has problems with slack.
Of course the space doesn't really "disappear", assuming we are not talking about lost clusters, which can make space really unusable on a disk unless you use a scanning utility to recover it. The space is simply wasted as a result of the cluster system that FAT uses. A cluster is the minimum amount of space that can be assigned to any file. No file can use part of a cluster under the FAT file system. This means, essentially, that the amount of space a file uses on the disk is "rounded up" to an integer multiple of the cluster size. If you create a file containing exactly one byte, it will still use an entire cluster's worth of space. Then, you can expand that file in size until it reaches the maximum size of a cluster, and it will take up no additional space during that expansion. As soon as you make the file larger than what a single cluster can hold, a second cluster will be allocated, and the file's disk usage will double, even though the file only increased in size by one byte.
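The "rounding up" described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of any FAT tool: the function names are made up for this example, and the sketch assumes that a zero-byte file is allocated no data clusters at all.

```python
def clusters_needed(file_size: int, cluster_size: int) -> int:
    """Whole clusters allocated to a file of file_size bytes."""
    # Ceiling division: any partly used cluster counts as a full one.
    # Assumption: a zero-byte file gets no data clusters.
    return -(-file_size // cluster_size)


def allocated_bytes(file_size: int, cluster_size: int) -> int:
    """Disk space the file occupies, always a multiple of the cluster size."""
    return clusters_needed(file_size, cluster_size) * cluster_size


def slack_bytes(file_size: int, cluster_size: int) -> int:
    """Unused space at the end of the file's last cluster."""
    return allocated_bytes(file_size, cluster_size) - file_size


if __name__ == "__main__":
    cluster = 32 * 1024  # 32 kiB clusters
    # One byte, exactly one full cluster, and one byte more than a cluster.
    for size in (1, cluster, cluster + 1):
        print(size, allocated_bytes(size, cluster), slack_bytes(size, cluster))
```

The last case shows the jump described in the text: growing the file by a single byte past the cluster boundary doubles its disk usage.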
Think of this in terms of collecting rain water in quart-sized glass bottles. Even if you collect just one ounce of water, you have to use a whole bottle. Once the bottle is in use, however, you can fill it with 31 more ounces, until it is full. Then you'll need another whole bottle to hold the 33rd ounce.
Since files are always allocated whole clusters, on average the larger the cluster size of the volume, the more space will be wasted. (When collecting rain water, it's more efficient to use smaller, cup-sized bottles instead of quart-sized ones, if minimizing the amount of storage space is a concern.) If we take a disk that has a truly random distribution of file sizes, then on average each file wastes half a cluster. (Each file uses some number of whole clusters and then a random amount of the last cluster, so on average half a cluster is wasted.) This means that if you double the cluster size of the disk, you double the amount of storage that is wasted. Storage space that is wasted in this manner, due to space left at the end of the last cluster allocated to the file, is commonly called slack.
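The "half a cluster on average" claim can be checked by brute force. The sketch below averages the slack over every possible file size from one byte up to a few full clusters, which is one simple way to model a "truly random" size distribution; the function name and the choice of four clusters as the upper limit are assumptions made for this example.

```python
def average_slack(cluster_size: int, max_clusters: int = 4) -> float:
    """Mean slack over every file size from 1 byte to max_clusters full clusters."""
    sizes = range(1, max_clusters * cluster_size + 1)
    # -size % cluster_size is the unused remainder of the file's last cluster.
    total = sum(-size % cluster_size for size in sizes)
    return total / len(sizes)


if __name__ == "__main__":
    for kib in (4, 8, 16, 32):
        cluster = kib * 1024
        print(f"{kib:>2} kiB clusters: average slack {average_slack(cluster):.1f} bytes")
```

Under this model the average works out to just under half a cluster, and it doubles (almost exactly) each time the cluster size doubles.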
In reality, the situation is usually worse than this theoretical average. The files on most hard disks don't follow a random size pattern; in fact, most files tend to be small. (Take a look in your web browser's cache directory sometime!) A hard disk holding many small files wastes far more space. There are utilities that you can use to analyze the amount of wasted space on your disk volumes, such as the fantastic Partition Magic. It is not uncommon for very large disks that are in single partitions to waste up to 40% of their space due to slack, although 25-30% is more common.
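For a rough idea of what such an analysis involves, here is a minimal sketch that walks a directory tree and totals the slack for a cluster size you supply. It is not how Partition Magic or any other utility works internally: the function name is hypothetical, the cluster size is taken on trust rather than read from the volume, and space used by directory entries themselves is ignored.

```python
import os
from typing import Tuple


def directory_slack(root: str, cluster_size: int) -> Tuple[int, int]:
    """Return (total file bytes, total slack bytes) for files under root."""
    data = slack = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            data += size
            # Unused remainder of the file's last cluster.
            slack += -size % cluster_size
    return data, slack


if __name__ == "__main__":
    # Assumed cluster size of 32 kiB; adjust to match the volume being examined.
    data, slack = directory_slack(".", 32 * 1024)
    print(f"file data: {data} bytes, slack: {slack} bytes")
```

Running it against a directory full of small files (a browser cache, for example) should show slack that is large relative to the file data.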
Let's take an example to illustrate the situation. Consider a hard disk volume that is using 32 kiB clusters, with 17,000 files in the partition. If we assume that each file has half a cluster of slack, then we are wasting 16 kiB of space per file. Multiply that by 17,000 files, and we get a total of 265 MB of slack space. If we assume that most of the files are smaller, so that on average each file has slack of around two-thirds of a cluster instead of one-half, this jumps to 354 MB!
If we were able to use a smaller cluster size for this disk, the amount of space wasted would be reduced dramatically. The table below shows a comparison of the slack for various cluster sizes for this example. The more files on the disk, the worse the slack gets. To find the percentage of disk space wasted in this example, divide the slack figure by the size of the disk. Taking the 354 MB figure, if this were a (full) 1.2 GB disk using 32 kiB clusters, roughly 30% of that space would be slack. If the disk were 2.1 GB in size, the slack percentage would be about 17%:
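The figures in this example can be recomputed with a short script. This is a sketch under stated assumptions: the function names are invented for illustration, the cluster sizes listed (4, 8, 16 and 32 kiB) are the ones mentioned on this page rather than a copy of the original table's rows, "MB" is taken to mean 1,048,576 bytes, and the percentage helper treats 1 GB as 1,000 MB, so results are approximate.

```python
KIB = 1024
MIB = 1024 * 1024


def total_slack_mb(files: int, cluster_kib: int, slack_fraction: float) -> float:
    """Total slack in MB when each file wastes slack_fraction of one cluster."""
    return files * cluster_kib * KIB * slack_fraction / MIB


def slack_percent(slack_mb: float, disk_gb: float) -> float:
    """Slack as a percentage of disk size (assumes 1 GB = 1,000 MB)."""
    return 100 * slack_mb / (disk_gb * 1000)


if __name__ == "__main__":
    files = 17_000
    print("cluster   1/2 slack   2/3 slack")
    for cluster_kib in (4, 8, 16, 32):
        half = total_slack_mb(files, cluster_kib, 1 / 2)
        two_thirds = total_slack_mb(files, cluster_kib, 2 / 3)
        print(f"{cluster_kib:>3} kiB  {half:7.1f} MB  {two_thirds:7.1f} MB")

    worst = total_slack_mb(files, 32, 2 / 3)
    for disk_gb in (1.2, 2.1):
        print(f"{disk_gb} GB disk: {slack_percent(worst, disk_gb):.0f}% slack")
```

Because slack scales linearly with cluster size, each halving of the cluster size halves both columns.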
As you can see, the larger the cluster size used, the more of the disk's space is wasted due to slack. Therefore, it is better to use smaller cluster sizes whenever possible. This is, unfortunately, sometimes easier said than done. The number of clusters we can use is limited by the nature of the FAT file system, and there are also performance tradeoffs in using smaller cluster sizes. Therefore, it isn't always possible to use the absolute smallest cluster size in order to maximize free space. One way that cluster sizes can be reduced is to use FAT32 instead of FAT16, as described in other pages in this section. However, on very large modern hard disks, big partitions even in FAT32 use rather hefty cluster sizes!
Also realize that there will always be some space wasted regardless of the cluster size chosen. Most people consider the amount of slack on partitions with 4 kiB or 8 kiB clusters to be acceptable; most consider the slack on partitions with 32 kiB clusters excessive; and partitions with 16 kiB clusters seem to go both ways. It depends entirely on your needs, and how critical your disk space is. With today's large disks, many people don't care as much about slack as they used to; a typical PC user bringing home a new machine with a 30 GB hard disk isn't going to get it half full even after quite a while. For others, slack is still very important. Personally, I avoid only the 32 kiB cluster partitions when possible, but I (more than many others) also dislike having my disk broken into many pieces. See this discussion of the tradeoffs between slack space waste and "end of volume" space waste as well for more perspective on choosing cluster sizes.
Tip: Do remember not to go overboard in your efforts to avoid slack. To keep it all in perspective, let's take the worst case above, where 354 MB of space is wasted. With the cost per megabyte of disk now below 1 cent, this means that the "cost" of this problem is less than $5. That doesn't mean that wasting hundreds of megabytes of storage is smart; obviously I don't think that or I wouldn't have written so much about slack and partitioning. :^) But on the other hand, spending 20 hours and $50 on utility software to avoid it may not be too smart either, considering that for not much more than that you can get a second hard disk with dozens of gigabytes! Moderation is often the key to using partitioning to reduce slack, so don't be taken in by some of the "partitioning fanatics" who seem to have lost sight of the fact that disk space is really very cheap today.