Total files same; Total bytes different?


Dinosaur
08-29-2005, 12:30 AM
I used a very reliable utility (Ztree for Windows) to copy all the files from a CD to a Folder on a hard disk. The utility had problems reading some files, but when told to retry, it did not give further error messages.

Previous attempts to copy using Windows Explorer terminated due to errors reading some files, indicating some problems with the CD.

A check of the copy reports that the number of files in the disk folder is the same as the number of files on the CD, but it reports fewer bytes of data. The statistics are as follows.

2066 files / 530,063,485 bytes on CD
2066 files / 529,479,356 bytes in disk folder

Check Disk via My Computer reports no errors for the hard disk, which is using NTFS. The CD uses CDFS.

I am running Windows XP Home Edition with Service Pack 2.

Any thoughts on what might have happened? Might all be well, with the discrepancy being due to different control data used by two file systems? Might it be due to different size allocation clusters?

pangea33
08-29-2005, 01:19 AM
I was going to post something about block sizes until I saw that you already did. Sure seems likely to me. Does everything seem to work all right? If not, you could always rip an ISO file, move it to your hard drive, and then mount it using a utility like Alcohol 120%.

saphalline
08-29-2005, 03:50 AM
"Might all be well, with the discrepancy being due to different control data used by two file systems? Might it be due to different size allocation clusters?"

Yep, you hit the nail on the head. CDFS and NTFS are completely different file systems, and they have different uses. CDFS is defined very rigidly because it needs to be easily read by older systems (it WAS invented a long time ago, after all) and the size of a CD is pretty well known! :p CDFS was made for a specific purpose and it does it well, but it's not dynamic like NTFS or the FATs.

The difference in overhead related to what you just witnessed is based on the parameters of NTFS installed on your specific system. The size of your hard drive (or partition) determines how much space is taken by NTFS for keeping track of all the data, as well as the cluster size and all that other file system stuff. In CDFS, the info used for not losing data is built into the cluster for the data itself. In other words, a CDFS cluster keeps track of itself.

So when you move data from a CD to a dynamic file system like NTFS, the overhead in NTFS is smaller, and thus your data appears to take up less room. Make no mistake, however, in thinking that the extra overhead is gone - it's just moved. It moved to the NTFS data area at the beginning of your hard drive...

Paul Komski
08-29-2005, 03:54 AM
Compare the files' checksum values using CDcheck from http://www.elpros.si/CDCheck/

If the files are the same (regardless of cluster sizes) they will have the same checksums.
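For anyone who would rather script the comparison than install a tool, here is a rough Python sketch of the same idea. It is not how CDCheck works internally; the function names and the choice of SHA-256 are my own assumptions. It hashes every file under two folders and lists the relative paths whose contents differ or that are missing on one side.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_checksums(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def mismatched_files(source_root, copy_root):
    """Return relative paths that are missing on one side or differ in content."""
    source = tree_checksums(source_root)
    copy = tree_checksums(copy_root)
    return sorted(
        name
        for name in source.keys() | copy.keys()
        if source.get(name) != copy.get(name)
    )
```

Calling `mismatched_files` with the CD drive and the disk folder as arguments (whatever those paths are on your system) would return the list of suspect files. On a disc with read errors the read itself may raise `OSError`, which this sketch does not catch.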

Dinosaur
08-29-2005, 04:23 AM
Further investigation indicated that some files were not copied properly, probably due to transitory errors on the CD. While there might be differences in allocation cluster sizes and other overhead between a CD and a hard disk, research indicates that the Ztree utility reports the exact number of bytes of real data, ignoring unused parts of clusters and other overhead. When it reports a mismatch, something is wrong.
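To illustrate the distinction, here is a small Python sketch. The file sizes and the 2048 and 4096 byte cluster sizes are made-up illustrative values, not measurements from my disks; only the two byte totals come from the statistics I posted. The sum of real data bytes is the same whatever the cluster size, while the allocated space is not.

```python
# Byte totals reported for the CD and for the disk folder.
CD_BYTES = 530_063_485
DISK_BYTES = 529_479_356


def allocated_bytes(file_sizes, cluster_size):
    """Space used when each file is rounded up to whole clusters."""
    return sum(-(-size // cluster_size) * cluster_size for size in file_sizes)


# Hypothetical file sizes in bytes, chosen only for illustration.
sizes = [1, 2048, 5000]

logical = sum(sizes)                  # real data, independent of cluster size
on_2k = allocated_bytes(sizes, 2048)  # rounded up to 2048-byte clusters
on_4k = allocated_bytes(sizes, 4096)  # rounded up to 4096-byte clusters

print(logical, on_2k, on_4k)   # 7049 10240 16384
print(CD_BYTES - DISK_BYTES)   # 584129 bytes of real data unaccounted for
```

Since a byte count of real data should come out identical on both sides, the 584,129 byte shortfall points to file contents that did not copy, not to overhead.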

I think that a mass copy of files from a CD with problems is more likely to result in errors or other problems than a copy of a few files or a copy of individual files.

My Ztree utility has a Branch compare which can identify and tag identical files. Using it, I was able to find the erroneously copied files and recopy them individually.
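For those without Ztree, the same find-and-recopy step could be roughly approximated in Python. This is only a sketch under my own assumptions, not what Ztree's Branch compare does internally; `recopy_mismatches` is a hypothetical helper name. It compares each source file byte for byte with its copy and recopies any file that is missing or different.

```python
import filecmp
import shutil
from pathlib import Path


def recopy_mismatches(source_root, copy_root):
    """Recopy files that are missing from copy_root or differ from the source.

    Returns the relative paths that were recopied.
    """
    source_root, copy_root = Path(source_root), Path(copy_root)
    recopied = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        dst = copy_root / src.relative_to(source_root)
        # shallow=False forces a comparison of file contents, not just metadata.
        if dst.is_file() and filecmp.cmp(src, dst, shallow=False):
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        recopied.append(src.relative_to(source_root).as_posix())
    return recopied
```

Running it a second time and getting an empty list back is a reasonable sign that the copy now matches, provided the disc reads cleanly on that pass.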

BTW: I recommend Ztree, for use in addition to Windows Explorer, not as a replacement. There is a learning curve, but it is worth the effort.