File systems and changed data


jlreich
10-29-2005, 09:50 AM
From another thread (http://www.pcguide.com/vb/showthread.php?p=256124#post256124) about data recovery.

On FAT partitions, at the most basic, the data for an updated file stays where it is as long as no new clusters need to be utilised; in other words the FAT tables are not changed at all even though the ones and zeros will have been changed in the data area that they reference.

Changing the file name or other attributes creates a new entry in the appropriate directory area but doesn't need to rewrite the data area either. If the file contracts or expands such that the FAT tables need editing, then the OS decides whether to just extend/contract the existing "daisy-chain" or to write the relevant data to a completely new area. The latter would be done whenever possible if it would prevent the file from becoming fragmented. If the data gets written in a new location, the original file data stays where it is until it gets overwritten. In such circumstances the original may well be recoverable by good software - but it is by no means a given.
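The daisy-chain idea can be sketched with a toy table. This is only an illustration under simplifying assumptions: the FAT is modelled as a plain Python list, and `EOC`, `FREE`, `cluster_chain` and `extend_chain` are hypothetical names, not anything a real FAT driver exposes. Real FAT12/16/32 tables use reserved value ranges for the end-of-chain and bad-cluster markers.

```python
# Toy model of a FAT: index = cluster number, value = next cluster in the chain.
EOC = -1   # hypothetical end-of-chain marker
FREE = 0   # hypothetical "cluster unused" marker

def cluster_chain(fat, first_cluster):
    """Follow the daisy chain from first_cluster until the end-of-chain marker."""
    chain = []
    seen = set()
    cluster = first_cluster
    while cluster != EOC:
        if cluster in seen or fat[cluster] == FREE:
            raise ValueError(f"broken chain at cluster {cluster}")
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]
    return chain

def extend_chain(fat, first_cluster):
    """Append the lowest free cluster to the file's chain, editing the table in place."""
    chain = cluster_chain(fat, first_cluster)
    new_cluster = next(i for i in range(2, len(fat)) if fat[i] == FREE)
    fat[chain[-1]] = new_cluster
    fat[new_cluster] = EOC
    return new_cluster

# A file occupying clusters 2 -> 3 -> 5; cluster 4 belongs to another file.
fat = [FREE, FREE, 3, 5, EOC, EOC, FREE, FREE]
chain_before = cluster_chain(fat, 2)
# Overwriting bytes inside clusters 2, 3 and 5 would need no FAT edit at all.
# Only growing the file forces the table to change:
extend_chain(fat, 2)
chain_after = cluster_chain(fat, 2)
print(chain_before, chain_after)
```

Note that reaching the last cluster means stepping through every earlier link, which is the dependency the thread later contrasts with NTFS.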

The best way of observing these events is to create a small partition, write zeros to it and then format it as FAT. Then create one small file on it. Then make minor alterations to it; etc, etc. Examine each stage with a hex editor and everything begins to fall into place.
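If raw images of the test partition are saved at each stage, the comparison can also be scripted rather than eyeballed in the hex editor. A rough sketch, assuming 512-byte sectors and two images of equal size; `changed_sectors` is a made-up helper name and the in-memory "images" below merely stand in for real image files:

```python
SECTOR = 512  # assumed sector size

def changed_sectors(before, after, sector_size=SECTOR):
    """Return the numbers of the sectors whose bytes differ between two images."""
    if len(before) != len(after):
        raise ValueError("images must be the same size")
    changed = []
    for offset in range(0, len(before), sector_size):
        if before[offset:offset + sector_size] != after[offset:offset + sector_size]:
            changed.append(offset // sector_size)
    return changed

# Stand-in for a zero-filled partition image of 8 sectors.
image_before = bytes(8 * SECTOR)
image_after = bytearray(image_before)
image_after[2 * SECTOR] = 0x20        # pretend a directory entry was touched
image_after[5 * SECTOR + 10] = 0x41   # pretend one byte of file data was edited
diff = changed_sectors(image_before, bytes(image_after))
print(diff)
```

Starting from a zeroed partition is what makes this useful: any sector that differs was written by the format or by the file operation just performed.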
Most interesting, Paul. Your knowledge of OSes, hard drives, file systems, and data recovery continues to astound me.

If you don't mind me asking, how does NTFS handle such a change? Would it be more or less likely to recover the original of a wrongly updated file? What about a Linux file system?

Paul Komski
10-30-2005, 12:55 PM
The simple answer is I don't know the definitive answer about NTFS/ext2/ReiserFS other than that they are more similar to one another and very different from FAT. I just haven't had the chance to experiment and that is the only real way to learn. Only recently have I found an app (WinHex) which allows me to extract the MFT itself; up to now it was an extremely occult entity.

One has no hope of understanding NTFS until it is grasped that everything inside the partition is a collection of files; files that can have a variety of attributes; attributes that can be resident or non-resident to their master file table ($MFT) entry.

The $MFT is the "backbone" (itself possibly fragmented) of the whole NTFS. A typical file entry has three main attributes: $STANDARD_INFORMATION, $FILE_NAME and $DATA; a $BITMAP attribute appears on larger directories and on the $MFT itself. If the file is small then all the file's data can be completely kept in the $MFT. If the data needs to be non-resident then that "non-resident" attribute contains the information about where the data-runs are on disk as referenced by LCN (Logical Cluster Number). If there is no room for the non-resident attribute in its MFT entry, a new MFT entry is created for it and an attribute-list attribute is created to link the attributes of that one file entry; that doesn't happen very often.
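The resident/non-resident distinction can be sketched as a walk over the attribute area of one MFT entry. Treat this as an illustration only: the field offsets follow the attribute header layout as described in the Linux-NTFS documentation linked further down the thread, the record header and fixups that precede the attributes in a real entry are skipped entirely, and `make_resident` / `make_non_resident` are hypothetical helpers that just fabricate test bytes.

```python
import struct

END_MARKER = 0xFFFFFFFF   # type value that terminates the attribute list
DATA = 0x80               # attribute type of $DATA

def walk_attributes(area):
    """Return (type, is_resident, payload) for each attribute in a raw attribute area."""
    found = []
    pos = 0
    while pos + 4 <= len(area):
        (attr_type,) = struct.unpack_from("<I", area, pos)
        if attr_type == END_MARKER:
            break
        length, non_resident_flag = struct.unpack_from("<IB", area, pos + 4)
        if length == 0:
            raise ValueError("zero-length attribute, area is corrupt")
        if non_resident_flag:
            # Payload is the data-run list that says where the clusters live.
            (runs_offset,) = struct.unpack_from("<H", area, pos + 0x20)
            payload = area[pos + runs_offset:pos + length]
        else:
            # Payload is the file content itself, stored inside the MFT entry.
            size, offset = struct.unpack_from("<IH", area, pos + 0x10)
            payload = area[pos + offset:pos + offset + size]
        found.append((attr_type, not non_resident_flag, payload))
        pos += length
    return found

def make_resident(attr_type, content):
    """Fabricate a resident attribute (demo helper, not a real API)."""
    length = (0x18 + len(content) + 7) & ~7   # pad to a multiple of 8 bytes
    header = struct.pack("<IIBBHHHIHBB", attr_type, length, 0, 0, 0x18, 0, 0,
                         len(content), 0x18, 0, 0)
    return (header + content).ljust(length, b"\x00")

def make_non_resident(attr_type, runs):
    """Fabricate a non-resident attribute holding only a run list (demo helper)."""
    length = (0x40 + len(runs) + 7) & ~7
    header = struct.pack("<IIBBHHHQQHHIQQQ", attr_type, length, 1, 0, 0x40, 0, 0,
                         0, 0, 0x40, 0, 0, 0, 0, 0)
    return (header + runs).ljust(length, b"\x00")

end = struct.pack("<I", END_MARKER)
small_file = make_resident(DATA, b"tiny file body") + end
large_file = make_non_resident(DATA, bytes.fromhex("2118345600")) + end
print(walk_attributes(small_file))
print(walk_attributes(large_file))
```

For the small file the payload is the content itself; for the large one it is only a short list of references to clusters elsewhere on the partition.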

Simplistically, the MFT entry holds a list of short entries that defines where a file's data can be found, so the references are generally all kept in one place. The daisy chain of a FAT doesn't come into play, and finding one fragment does not depend on following the previous fragment through the FATs. This is just one reason why fragmented files are less of a performance problem under NTFS than FAT, and why data recovery of deleted files or corrupt partitions is generally better.
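That list of short entries is the data-run list, and decoding one is a small job. A sketch, assuming the encoding as it is usually documented: each run starts with a header byte whose low nibble gives the size of the length field and whose high nibble gives the size of the offset field; the offset is a signed delta from the start of the previous run, and a zero byte ends the list. `decode_data_runs` is an illustrative name, and compressed attributes are not handled.

```python
def decode_data_runs(raw):
    """Decode an NTFS data-run list into (start_lcn, cluster_count) pairs."""
    runs = []
    pos = 0
    lcn = 0
    while pos < len(raw) and raw[pos] != 0:
        header = raw[pos]
        length_size = header & 0x0F   # bytes used by the cluster count
        offset_size = header >> 4     # bytes used by the LCN delta
        pos += 1
        count = int.from_bytes(raw[pos:pos + length_size], "little")
        pos += length_size
        if offset_size == 0:
            runs.append((None, count))   # sparse run: no clusters on disk
        else:
            delta = int.from_bytes(raw[pos:pos + offset_size], "little", signed=True)
            pos += offset_size
            lcn += delta                 # deltas accumulate into absolute LCNs
            runs.append((lcn, count))
    return runs

# Three fragments; the third delta is negative, so that fragment sits
# at a lower cluster number than the second.
fragmented = bytes.fromhex("11 30 60 21 10 00 01 11 20 E0 00")
for start, count in decode_data_runs(fragmented):
    print(hex(start), count)
```

Every fragment's location comes out of one short byte string held in the MFT entry, with no walk through a separate table.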

Without actually experimenting with a hex editor (which is much easier to do with a FAT FS) I can only assume that such data recovery is probably equivalent to what goes on under the FAT FS. If files contract after being updated there is little in it for the OS to rewrite a whole new MFT entry and put the data in new data-runs. With an increase in file size I can only imagine that it is less likely to rewrite a whole new data entry but simply add new data-run references in the MFT attributes.

I thus think, but just don't know, that such updated files are less likely to be recovered than under FAT. Even with changes in file names there is less likely to be much editing of the data area, since NTFS (and the Linux file systems) allow multiple "directory" entries with different names etc. to reference the same data area.
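The "multiple directory entries, one data area" point is what a hard link is, and it can be tried out from a script. A small sketch, assuming the temporary directory lives on a file system that supports hard links (NTFS and the usual Linux file systems do; FAT does not); the file names and the `link_demo` function are invented for the example.

```python
import os
import tempfile

def link_demo():
    """Create one file, give it a second name, then delete the first name."""
    with tempfile.TemporaryDirectory() as folder:
        original = os.path.join(folder, "report.txt")
        alias = os.path.join(folder, "renamed.txt")
        with open(original, "w") as handle:
            handle.write("same data area")
        os.link(original, alias)                   # second directory entry, no data copied
        same = os.path.samefile(original, alias)   # both names resolve to one file
        links = os.stat(original).st_nlink         # how many names point at the data
        os.remove(original)                        # drops one name only
        with open(alias) as handle:
            survivor = handle.read()
    return same, links, survivor

print(link_demo())
```

The data is written once; adding or removing a name only touches directory and metadata structures.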

I know I don't have THE answer but I hope that gives a little insight.

jlreich
10-30-2005, 01:43 PM
Thank you so much Paul.

So you're saying that if a file's data is small enough it will actually be kept in the MTF itself, and not actually written anywhere else on the partition? And if I remember right there is a backup of the MTF that is mirrored. Meaning it is written at the same time as the first and there is no chance of it still having the old references or the actual data if resident? It's just a mirror in case the first becomes corrupted?

And seeing how there would be no reason most of the time for the OS to write a new MTF entry for it, or actually rewrite the data altogether in a new location, then data recovery of the type we are talking about is very unlikely, at least compared to the FAT FS.

Then again, if there is a reference appended to the MTF is it time stamped? And is it referenced where the old/new data begins and ends? If so then it seems that recovery would be relatively simple, depending on how many times the actual file has been overwritten. Well it would seem that way to my unknowing mind anyway. :rolleyes:

Thanks again. It does help a lot. And sorry for my rambling, I'm just kind of thinking out loud. :rolleyes:

Paul Komski
10-30-2005, 02:58 PM
Pedantically - for MTF read MFT. The $MFTMirr is not a mirror of the whole $MFT, only the first 4 records. Those records plus the $LogFile will however often allow a damaged $MFT to be rebuilt. I haven't ever looked into the innermosts of the $LogFile but it would be the most likely way of re-referencing any old data.

I have never come across different parts of a file's data being given different timestamps. Timestamp attributes for "created" and "modified" would apply to the file as a whole.

Good info on NTFS at http://linux-ntfs.sourceforge.net/ntfs/

jlreich
10-30-2005, 04:26 PM
Ok I thought it was a true mirror. Good to know.

I have been reading the link you gave and it is very interesting. :cool: It's actually even more complicated at that level than I thought it would be.

I need a new laptop so I can go sit on the can where I do my best thinking. :p You know the commercial where the boss takes everyone into the shower? Well I would have to have everyone sit on the can. :eek: :D

Hmm....I might not have very many employees for long.....LOL :eek:

Thanks again Paul.