NTFS Files and Data Storage

As with most file systems, the fundamental unit of storage in NTFS from the user's perspective is the file. A file is just a collection of any sort of data, and can contain anything: programs, text, audio clips, database records, and thousands of other kinds of information. The operating system doesn't distinguish between types of files; how a particular file is used depends on how the applications that open it interpret its contents.

Within NTFS, all files are stored in pretty much the same way: as a collection of attributes. This includes the data in the file itself, which is just another attribute: the "data attribute", technically. To understand how NTFS stores files, you first need to understand the basics of NTFS architecture, and in particular what the Master File Table (MFT) is and how it works. You may also wish to review the discussion of NTFS attributes, because understanding the difference between resident and non-resident attributes is important to making any sense at all of the rest of this page. ;^)

How NTFS stores the data in a file depends on the size of the file. The core structure of every file is built from the following header and attributes, which are stored for each file:

  • Header (H): The header in the MFT is a set of low-level management data used by NTFS to manage the file. It includes sequence numbers used internally by NTFS and pointers to the file's other attributes and free space within the record. (Note that the header is part of the MFT record but not an attribute.)
  • Standard Information Attribute (SI): This attribute contains "standard" information stored for all files and directories. This includes fundamental properties such as date/time-stamps for when the file was created, modified and accessed. It also contains the "standard" FAT-like attributes usually associated with a file (such as whether the file is read-only, hidden, and so on.)
  • File Name Attribute (FN): This attribute stores the name associated with the file. Note that a file can have multiple file name attributes, to allow the storage of the "regular" name of the file, along with an MS-DOS short filename alias and also POSIX-like hard links from multiple directories. See here for more on NTFS file naming.
  • Data (Data) Attribute: This attribute stores the actual contents of the file.
  • Security Descriptor (SD) Attribute: This attribute contains security information that controls access to the file. The file's Access Control Lists (ACLs) and related data are stored here.
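
The list above can be pictured as a small data structure. What follows is only a rough sketch in Python, not the real on-disk layout: the class names, the field names and the placeholder byte strings are all made up for illustration, and real NTFS records carry far more detail than this.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Attribute:
    kind: str                 # "SI", "FN", "Data" or "SD", as abbreviated above
    value: bytes = b""        # contents while the attribute is resident
    resident: bool = True     # False once the contents move outside the MFT
    runs: List[Tuple[int, int]] = field(default_factory=list)  # (start, length) pairs


@dataclass
class MftRecord:
    sequence_number: int      # stands in for the header's management data
    attributes: List[Attribute] = field(default_factory=list)

    def find(self, kind: str) -> List[Attribute]:
        """Return every attribute of the given kind (a file may have several FN)."""
        return [a for a in self.attributes if a.kind == kind]


def make_small_file(long_name: str, short_name: str, contents: bytes) -> MftRecord:
    """Build a record for a file small enough to live entirely in the MFT."""
    return MftRecord(
        sequence_number=1,
        attributes=[
            Attribute("SI", b"timestamps and read-only/hidden flags"),
            # Placeholder encoding; this sketch does not model how names are stored.
            Attribute("FN", long_name.encode("utf-8")),
            Attribute("FN", short_name.encode("utf-8")),
            Attribute("SD", b"access control lists"),
            Attribute("Data", contents),
        ],
    )


record = make_small_file("Quarterly Report.txt", "QUARTE~1.TXT", b"hello")
print([a.kind for a in record.attributes])
print(all(a.resident for a in record.attributes))
```

Note that the header is a field of the record rather than an entry in the attribute list, and that the file's contents sit in the same list as its name and security information.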

These are the basic attributes; others may also be associated with a file (see this full discussion of attributes for details). If a file is small enough that all of its attributes can fit within the MFT record for the file, it is stored entirely within the MFT. Whether this happens or not depends largely on the size of the MFT records used on the volume. If the file is too large for all of the attributes to fit in the MFT, NTFS begins a series of "expansions" that move attributes out of the MFT and make them non-resident. The sequence of steps taken is something like this:

  1. First, NTFS will attempt to store the entire file in the MFT entry, if possible. This will generally happen only for rather small files.
  2. If the file is too large to fit in the MFT record, the data attribute is made non-resident. The entry for the data attribute in the MFT then contains pointers to data runs (also called extents), which are blocks of data stored in contiguous sections of the volume, outside the MFT.
  3. The file may become so large that there isn't even room in the MFT record for the list of pointers in the data attribute. If this happens, the list of data attribute pointers is itself made non-resident. Such a file will have no data attribute in its main MFT record; instead, a pointer is placed in the main MFT record to a second MFT record that contains the data attribute's list of pointers to data runs.
  4. NTFS will continue to extend this flexible structure if very large files are created. It can create multiple additional MFT records if needed to store a great number of pointers to different data runs. The larger and more fragmented the file, the more complex the file storage structure becomes.
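
The steps above can be sketched as a simple decision function. This is a toy model under loudly stated assumptions, not how NTFS actually does its accounting: the 1,024-byte record size is a common value, but the 400 bytes of overhead for the header and other attributes and the 8 bytes per run pointer are invented figures, and the function name plan_storage is hypothetical.

```python
import math


def plan_storage(data_size, run_count, record_size=1024, overhead=400, bytes_per_run=8):
    """Return (layout, extra_records) for a file under this simplified model.

    overhead and bytes_per_run are illustrative guesses, not NTFS constants.
    """
    free = record_size - overhead          # room left in the record for the data attribute

    # Step 1: the whole file fits in its MFT record.
    if data_size <= free:
        return ("resident", 0)

    # Step 2: data moves out to runs; only the list of run pointers stays in the record.
    if run_count * bytes_per_run <= free:
        return ("non-resident", 0)

    # Steps 3 and 4: the pointer list itself spills into additional MFT records.
    runs_per_record = free // bytes_per_run
    return ("extended", math.ceil(run_count / runs_per_record))


print(plan_storage(300, 0))               # tiny file
print(plan_storage(1_000_000, 10))        # larger file, lightly fragmented
print(plan_storage(10**9, 200))           # huge file, heavily fragmented
```

In this model it is the number of runs, not the raw size of the file, that pushes a file from step 2 to step 3: a very large file stored in a handful of runs still needs only a short pointer list.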

The data runs (extents) are where most file data in an NTFS volume is stored. These runs consist of blocks of contiguous clusters on the disk. The pointers in the data attribute(s) for the file contain a reference to the start of the run, and also the number of clusters in the run. The start of each run on the volume is given as a logical cluster number (LCN), while the clusters within the file are numbered using virtual cluster numbers (VCNs), so the list of runs in effect maps VCNs to LCNs. The use of a "pointer+length" scheme means that under NTFS, it is not necessary to read each cluster of the file in order to determine where the next one in the file is located. This method also reduces fragmentation of files compared to the FAT setup.
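
To see why "pointer+length" helps, here is a minimal sketch of both lookups in Python. The function names and the cluster numbers are made up for illustration, and the FAT is modeled as a plain dictionary from each cluster to the next one; neither function reflects real on-disk encodings.

```python
def locate_cluster(runs, index_in_file):
    """Find where the Nth cluster of a file lives, given (start, length) runs.

    runs is a list of (starting cluster on the volume, number of clusters),
    in file order. Only the short run list is examined, never the file data.
    """
    if index_in_file < 0:
        raise ValueError("cluster index cannot be negative")
    first_index = 0                        # file-relative index of the run's first cluster
    for start, length in runs:
        if index_in_file < first_index + length:
            return start + (index_in_file - first_index)
        first_index += length
    raise ValueError("cluster index is beyond the end of the file")


def fat_nth_cluster(fat, first_cluster, n):
    """FAT-style lookup: follow the chain one link at a time, n times."""
    cluster = first_cluster
    for _ in range(n):
        cluster = fat[cluster]
    return cluster


# A seven-cluster file stored in two runs: clusters 1000-1003, then 5000-5002.
runs = [(1000, 4), (5000, 3)]
fat = {1000: 1001, 1001: 1002, 1002: 1003, 1003: 5000, 5000: 5001, 5001: 5002}

print(locate_cluster(runs, 5))         # two run entries examined
print(fat_nth_cluster(fat, 1000, 5))   # five chain links followed
```

Both calls arrive at the same cluster, but the run-based lookup does work proportional to the number of runs, while the chain-based lookup does work proportional to the number of clusters.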

The PC Guide (http://www.PCGuide.com)
Site Version: 2.2.0 - Version Date: April 17, 2001
Copyright 1997-2004 Charles M. Kozierok. All Rights Reserved.
