33

As I remember, DOS had to update a backup copy of the FAT with each disk write, which I imagine would have greatly slowed write speed. Yet I don't recall any recovery software that used the second copy. Were there any DOS sys utilities that did? Did Norton?

What was the first version of DOS that used a backup FAT? Did floppies have it?

The only time I ever heard the second FAT mentioned, it was just to say that it existed.

6
  • 11
    The FAT surely doesn't need to be updated with each disk write, only with each allocation or truncation. Commented Nov 24 at 15:02
  • 6
    Per Wikipedia, the second FAT was used on a read error from the first FAT -- this would have been handled by the file system code within DOS. Presumably any recovery program that accessed the raw disk could (should) adopt the same strategy. It doesn't seem like a major selling point for such programs. Commented Nov 24 at 15:11
  • 1
    Every standard on-disk format comes with exactly 2 copies of the FAT, so much so that IIRC some DOS versions don't support file systems with other amounts of them (either on diskettes or hard disks). The one well-known example of single-FAT file systems is RAM drives. Commented Nov 24 at 17:44
  • 1
    @dave: But surely the vast majority of writes would either create a new file or extend an existing one, and thus need to allocate FAT entries? Especially if your program buffers its output and does writes of at least the cluster size (often 1 sector). Rewriting the contents of an existing file has relatively few use cases. Commented Nov 25 at 14:58
  • 8
    I happened to use the 2nd FAT when recovering someone's disk half a year ago, when the rest of the disk was OK but the primary FAT had gotten corrupted. Strangely, many tools did not even attempt to use it – had to use fatcat to manually load it over the broken primary FAT. Commented Nov 25 at 19:43

1 Answer

34

The second copy of the FAT is used by disk check programs.

The idea is simple: if a write to the first copy of the FAT is interrupted by power loss, reading that sector back results in a CRC error. The sector can then be recovered by reading the corresponding sector from the second copy and writing it over the one in the first.
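As a rough illustration of that recovery step, here is a minimal sketch that works on an in-memory FAT12/FAT16 disk image. The function names `fat_layout` and `restore_fat_sector` are hypothetical, and the sketch assumes the caller already knows which FAT sector failed to read (an image file cannot itself produce a CRC error). It is not how any particular disk check tool does it.

```python
import struct

def fat_layout(image):
    """Read the FAT12/FAT16 BIOS parameter block from the boot sector.

    Returns (bytes_per_sector, first_fat_offset, fat_size_bytes, fat_count).
    """
    bytes_per_sector, = struct.unpack_from("<H", image, 11)
    reserved_sectors, = struct.unpack_from("<H", image, 14)
    fat_count = image[16]
    sectors_per_fat, = struct.unpack_from("<H", image, 22)
    return (bytes_per_sector,
            reserved_sectors * bytes_per_sector,
            sectors_per_fat * bytes_per_sector,
            fat_count)

def restore_fat_sector(image, sector_in_fat):
    """Overwrite one sector of the first FAT with the same sector of the second.

    `image` is a mutable bytearray; `sector_in_fat` is the index of the
    unreadable sector counted from the start of the FAT (an assumption:
    a real tool would get this from the failed read).
    """
    bps, fat1, fat_size, fat_count = fat_layout(image)
    if fat_count < 2:
        raise ValueError("volume has no second FAT to recover from")
    if not 0 <= sector_in_fat * bps < fat_size:
        raise IndexError("sector index is outside the FAT")
    dst = fat1 + sector_in_fat * bps   # damaged sector in the first copy
    src = dst + fat_size               # same sector in the second copy
    image[dst:dst + bps] = image[src:src + bps]
```

A real tool would issue sector reads and writes against the device rather than slicing a buffer, but the offset arithmetic is the same: the second copy starts exactly one FAT length after the first.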

This recovery is listed as broken in the Windows ME version of ScanDisk, which implies it worked in earlier versions. Source: http://www.multiboot.ru/msdos8.htm

I've pulled apart dosfsck and FreeDOS's chkdsk, and neither of them implements this.

To make matters worse, an SSD does not return a CRC error for a write that failed due to power loss; it returns all 1 bits instead. A tool that waits for a read error before consulting the second copy never notices this, which means everybody's disk check tools are broken now.
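One conceivable way to notice that case is to compare the two copies sector by sector and flag any sector that is all 1 bits in one copy but not in the other. This is a hedged sketch of that heuristic only; `erased_looking_sectors` is a hypothetical name, and none of the tools mentioned above are known to do this. A sector made up entirely of end-of-chain markers would also be all 1 bits, so a hit is a hint, not proof.

```python
def erased_looking_sectors(fat1, fat2, bytes_per_sector=512):
    """Compare two FAT copies and report sectors that look erased.

    Returns a list of (sector_index, copy_number) pairs, where copy_number
    is 1 or 2 and names the copy whose sector is all 0xFF while the other
    copy holds different data.
    """
    erased = b"\xff" * bytes_per_sector
    hits = []
    for offset in range(0, min(len(fat1), len(fat2)), bytes_per_sector):
        a = fat1[offset:offset + bytes_per_sector]
        b = fat2[offset:offset + bytes_per_sector]
        # Identical sectors are fine even if both are all 1 bits.
        if a != b and erased in (a, b):
            hits.append((offset // bytes_per_sector, 1 if a == erased else 2))
    return hits
```

The two byte strings can be sliced out of a disk image using the reserved-sector count and FAT size from the boot sector.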

14
  • 2
    Don’t most SSDs contain caps that hold enough energy for them to be able to finish a write that’s already in progress if there’s a power loss? Commented Nov 24 at 16:00
  • 2
    @TeaRex: Do they? I've seen the fill bits result. Commented Nov 24 at 16:04
  • 13
    @TeaRex: The feature you mean is called Power Loss Protection and is by far not universal. It is mostly absent in consumer SSDs and spotty even in entry-level enterprise SSDs. However, higher-tier enterprise SSDs typically tend to have PLP almost universally. The problem isn't so much an in-progress write to flash but the rather large DRAM caches. SSDs with PLP can have much higher performance for synchronous writes since they can signal completion as soon as the data is in DRAM, whereas an SSD without PLP either has to sacrifice sync write performance … or just plain lie. Commented Nov 24 at 17:07
  • 6
    The most reasonable content a modern ssd can return from a failed (for whatever reason) write is the previous content of the sector. It does not write in the same place anyway (in contrast with hdds) and the previous content is the least wrong option (compared to, eg all 1s, all 0s or a read error). This is also one of the easiest options to implement in the wear leveling layer. Commented Nov 25 at 15:14
  • 3
    Oh, you're talking about DOS system-call error codes, where 23 is "Data error (CRC)". The DOS equivalent of Unix EIO, when the physical media signals an I/O error. (And floppy disks use CRC, hence the error name. Modern media uses error-correction codes of course, not just detection codes like CRC, but can still detect most uncorrectable errors.) So what you actually mean is that MS-DOS was designed around power-loss writes producing bad sectors with noisy failure, rather than silent bad or stale data. Very unclear to just say "CRC error"; not being steeped in DOS, I had no idea what... Commented Nov 26 at 5:54
