Sparse File Support

One of the more interesting features added to NTFS version 5.0 under Windows 2000 is special support for sparse files. As the name suggests, these are files that are primarily empty. Many ordinary files contain some areas of "empty space"--long stretches of zeroes. Normally, these zeroes are stored along with the "real data" in the file, and take up disk space in exactly the same way. If files are not too large, storing these zeroes is not much of a problem: some space is spent on the "empty" areas, but this is expected and manageable.

However, certain applications employ very large files that contain extremely large areas of empty space. A common example is a large database file, which may contain runs of millions of zeroes, representing either deleted data or placeholders for future data storage--such as unused database fields. There are also scientific applications that use very large files where small pieces of data must be spread throughout the file, but much of the file is empty. If such a file is stored in the conventional way, a great deal of disk space is wasted on these repeated "blanks".

You may be wondering--why bother with a special feature to address these files when NTFS already includes built-in compression, which could be used to save the space taken up by the long strings of zeroes in sparse files? It's a good question. :^) The most likely reason is that compression introduces overhead that may not be acceptable for critical applications. Database systems are not only likely to have sparse files; they are also often heavily used applications where compression might cause unacceptable performance degradation.

To improve the efficiency of storing sparse files without necessitating the use of compression, a special feature was incorporated into NTFS under Windows 2000. A file that is likely to contain many zeroes can be tagged as a sparse file. When this is done, a special attribute is associated with the file, and it is stored in a different way from regular files. Only the actual data in the file is stored on the disk, and NTFS keeps track of where in the file each chunk of data belongs. The rest of the file's bytes--the zeroes--are not stored on disk. NTFS seamlessly manages sparse files so that they appear to applications like regular files. So, for example, when an application asks to read a particular sequence of bytes from a sparse file, NTFS will return both regular data and zeroes, as appropriate, just as it would for a regular file. The application can't tell that NTFS is playing "storage games" on the disk.
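
To make the idea concrete, here is a minimal sketch in Python of the bookkeeping described above. It is a toy in-memory model, not how NTFS itself is implemented: the class name SparseFileModel, its methods, and the 4,096-byte block size are all assumptions chosen for illustration. Only blocks that have been written are stored; a read that touches an unwritten region gets zeroes back, so the caller cannot tell the difference.

```python
class SparseFileModel:
    """Toy model of a sparse file (hypothetical, for illustration only)."""

    def __init__(self, size, block_size=4096):
        self.size = size              # logical size the application sees
        self.block_size = block_size  # assumed allocation unit
        self.blocks = {}              # block index -> bytearray actually stored

    def write(self, offset, data):
        """Store data at offset, allocating only the blocks it touches."""
        data = bytes(data)
        if offset < 0 or offset + len(data) > self.size:
            raise ValueError("write outside the logical file size")
        written = 0
        while written < len(data):
            index, start = divmod(offset + written, self.block_size)
            count = min(self.block_size - start, len(data) - written)
            block = self.blocks.setdefault(index, bytearray(self.block_size))
            block[start:start + count] = data[written:written + count]
            written += count

    def read(self, offset, length):
        """Return stored bytes where they exist and zeroes everywhere else."""
        if offset < 0 or length < 0:
            raise ValueError("offset and length must be non-negative")
        end = min(offset + length, self.size)
        out = bytearray()
        pos = offset
        while pos < end:
            index, start = divmod(pos, self.block_size)
            count = min(self.block_size - start, end - pos)
            block = self.blocks.get(index)
            if block is None:
                out += bytes(count)   # a "hole": nothing stored, so zeroes
            else:
                out += block[start:start + count]
            pos += count
        return bytes(out)

    def allocated_bytes(self):
        """Space actually consumed, as opposed to the logical size."""
        return len(self.blocks) * self.block_size


# A file that is logically about 1 GB but holds only nine bytes of data.
f = SparseFileModel(size=1_000_000_000)
f.write(500_000_000, b"real data")
print(f.read(499_999_998, 13))  # the data with two zero bytes on each side
print(f.allocated_bytes())      # one block, not the full logical size
```

In this model the file reports a logical size of a billion bytes, yet only a single block is held in storage. That is the same trade the text describes: a little extra bookkeeping in exchange for not storing the zeroes at all.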

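Note: Under Windows 2000, tagging a file as sparse is a "one-way trip"--you cannot change the file back into a normal file again. (Later versions of Windows relax this somewhat, allowing the sparse flag to be cleared on a file that no longer has any sparse regions.)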


The PC Guide (http://www.PCGuide.com)
Site Version: 2.2.0 - Version Date: April 17, 2001
Copyright 1997-2004 Charles M. Kozierok. All Rights Reserved.
