The power-safe file system is a reliable disk file system that can withstand power failures without losing or corrupting data. It is designed for traditional rotating hard disk drive media. This file system is supported by the fs-qnx6.so shared object.
Although existing disk file systems are designed to be robust and reliable, there's still the possibility of losing data, depending on what the file system is doing when a catastrophic failure (such as a power failure) occurs. For example:
You can get hard drives that offer atomic sector updates and promise that either all of the old data or all of the new data in the sector is readable, but these drives are rare and expensive.
To address the problems associated with existing disk file systems, the power-safe file system never overwrites live data; it does all updates using copy-on-write (COW), assembling a new view of the file system in unused blocks on the disk. The new view of the file system becomes live only when all the updates are safely written on the disk. Everything is COW: both metadata and user data are protected.
To see how this works, let's consider how the data is stored. A power-safe file system is divided into logical blocks, the size of which you can specify when you use mkqnx6fs to format the file system. Each inode includes 16 pointers to blocks. If the file is smaller than 16 blocks, the inode points to the data blocks directly. If the file is any bigger, those 16 blocks become pointers to more blocks, and so on.
The final block pointers to the real data are all in the leaves and are all at the same level. In some other file systems—such as EXT2—a file always has some direct blocks, some indirect ones, and some double indirect, so you go to different levels to get to different parts of the file. With the power-safe file system, all the user data for a file is at the same level.
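The relationship between file size and tree depth can be sketched as follows. This is a rough illustration rather than the file system's actual code: the function name is hypothetical, the 4-byte pointer size is an assumption (the text above doesn't state it), and the sketch assumes that a file of exactly 16 blocks still fits in the inode's direct pointers.

```python
POINTERS_PER_INODE = 16  # from the text: each inode includes 16 block pointers


def indirection_levels(file_size, block_size, pointer_size=4):
    """Return how many levels of pointer blocks lie between the inode and the data.

    pointer_size is an ASSUMPTION for illustration; the real on-disk
    pointer width isn't given in the text.
    """
    if file_size < 0 or block_size <= 0 or pointer_size <= 0:
        raise ValueError("sizes must be positive (file_size may be zero)")
    pointers_per_block = block_size // pointer_size
    if pointers_per_block < 2:
        raise ValueError("a block must hold at least two pointers")

    # Number of data blocks needed, rounded up.
    data_blocks = (file_size + block_size - 1) // block_size

    # Each extra level multiplies the number of reachable leaves, and
    # every leaf ends up at the same depth.
    levels = 0
    capacity = POINTERS_PER_INODE
    while data_blocks > capacity:
        capacity *= pointers_per_block
        levels += 1
    return levels
```

For example, under these assumptions a 40 KB file on a file system with 4 KB blocks needs no indirect blocks, while a file of 17 blocks needs one level of indirection for all of its data.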
If you change some data, it's written in one or more unused blocks, and the original data remains unchanged. The list of indirect block pointers must be modified to refer to the newly used blocks, but again the file system copies the existing block of pointers and modifies the copy. The file system then updates the inode, again by modifying a copy, to refer to the new block of indirect pointers. When the operation is complete, the original data and the pointers to it remain intact, but there's a new set of blocks, indirect pointers, and inode for the modified data.
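The copy-on-write update described above can be modeled in a few lines. This is a toy in-memory sketch, not the real on-disk format: the Disk class, cow_write(), and read_file() are hypothetical names, and the model uses a single level of indirection for brevity.

```python
class Disk:
    """Toy block store: every allocation uses a previously unused block."""

    def __init__(self):
        self.blocks = {}
        self.next_free = 0

    def allocate(self, content):
        number = self.next_free
        self.next_free += 1
        self.blocks[number] = content
        return number


def read_file(disk, inode_no):
    """Follow inode -> pointer block -> data blocks and return the data."""
    pointer_block_no = disk.blocks[inode_no]
    return [disk.blocks[n] for n in disk.blocks[pointer_block_no]]


def cow_write(disk, inode_no, index, new_data):
    """Replace data block `index` without overwriting any live block.

    Returns the block number of the NEW inode; the old inode, its
    pointer block, and the old data all remain intact.
    """
    old_pointers = disk.blocks[disk.blocks[inode_no]]

    # 1. Write the new data into an unused block.
    new_data_no = disk.allocate(new_data)

    # 2. Copy the pointer block and modify the copy.
    new_pointers = old_pointers[:index] + (new_data_no,) + old_pointers[index + 1:]
    new_pointer_block_no = disk.allocate(new_pointers)

    # 3. Copy the inode so that it refers to the new pointer block.
    return disk.allocate(new_pointer_block_no)
```

After cow_write() returns, reading through the old inode still yields the original contents, and reading through the new inode yields the modified contents; unmodified data blocks are shared between the two views.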
This has several implications for the COW file system:
A superblock is a global root block that contains the inodes for the system bitmap and inodes files. A power-safe file system maintains two superblocks:
The working superblock can include pointers to blocks in the stable superblock. These blocks contain data that hasn't yet been modified. The inodes and bitmap for the working superblock grow from it.
A snapshot is a consistent view of the file system (simply a committed superblock). To take a snapshot, the file system:
To mount the disk at startup, the file system simply reads the superblocks from disk, validates their CRCs, and then chooses the one with the higher sequence number. There's no need to run chkfsys or replay a transaction log. The time it takes to mount the file system is the time it takes to read a couple of blocks.
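The selection rule at mount time can be sketched like this. The Superblock layout, the helper names, and the use of zlib.crc32 are assumptions made for illustration; the text above doesn't specify the real on-disk format or CRC algorithm.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Superblock:
    sequence: int   # incremented each time a snapshot is committed
    payload: bytes  # stand-in for the root inodes and other fields
    crc: int        # checksum stored alongside the data


def checksum(sequence, payload):
    # Hypothetical checksum over the sequence number and payload.
    return zlib.crc32(sequence.to_bytes(8, "little") + payload)


def make_superblock(sequence, payload):
    return Superblock(sequence, payload, checksum(sequence, payload))


def choose_superblock(candidates):
    """Return the valid superblock with the highest sequence number.

    Superblocks whose stored CRC doesn't match are ignored; returns
    None if no candidate is valid.
    """
    valid = [sb for sb in candidates
             if sb.crc == checksum(sb.sequence, sb.payload)]
    if not valid:
        return None
    return max(valid, key=lambda sb: sb.sequence)
```

Because a superblock that was only partially written fails its CRC check, the older one is chosen in that case, which is why no log replay is needed.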
The copy-on-write (COW) method has some drawbacks:
However:
The performance of the file system depends on how much buffer cache is available, and on the frequency of the snapshots. Snapshots occur periodically (every 10 seconds, or as specified by the snapshot option to fs-qnx6.so), and also when you call sync() for the entire file system, or fsync() for a single file.
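An application that needs a particular file committed without waiting for the next periodic snapshot can call fsync() itself. The following is a minimal sketch using Python's os module, which wraps the POSIX calls; the function name write_durably() is hypothetical.

```python
import os


def write_durably(path, data):
    """Write data to path and call fsync() before closing the file."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        # os.write() may write fewer bytes than requested, so loop.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        # Ask the file system to commit this file's data now.
        os.fsync(fd)
    finally:
        os.close(fd)
```

Since each fsync() forces a commit, calling it after every small write is likely to reduce throughput; it's usually better to call it once, after a logically complete update.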
You can also turn snapshots off if you're doing some long operation, and the intermediate states aren't useful to you. For example, suppose you are copying a very large file into a power-safe file system. The cp utility is really just a sequence of basic operations:
If the file is big enough so that copying it spans snapshots, you have on-disk views that include the file not existing, the file existing at a variety of sizes, and finally the complete file copied and its IDs and permissions set:
Each snapshot is a valid point-in-time view of the file system (that is, if you have copied 50 MB, the size is 50 MB, and all data up to 50 MB is also correctly copied and available). If there's a power failure, the file system is restored to the most recent snapshot. But the file system has no concept that the sequence of open(), write(), and close() operations is really one higher-level operation, cp. If you want the higher-level semantics, disable the snapshots around the cp. The intermediate snapshots then don't happen, and if a power failure occurs, the file is either complete or not there at all.
For information about using this file system, see Power-safe file system.