Software for Backups

There are a number of software packages that allow you to perform backups. Some are vendor specific, and others are quite commonly available. Each may have particular benefits in a particular environment. We'll outline a few of the more common ones here, including a few that you might not otherwise consider. You should consult your local documentation to see if there are special programs available with your system.

Beware Backing up Files with Holes

Standard UNIX files are direct-access files; in other words, you can specify an offset from the beginning of the file, and then read and write from that location. If you ever had experience with older mainframe systems that only allowed files to be accessed sequentially, you know how important random access is for many things, including building random-access databases.

An interesting case occurs when a program references beyond the "end" of the file and then writes. What goes into the space between the old end-of-file and the data just now written? Zero-filled bytes would seem to be appropriate, as there is really nothing there.

Now, consider that the span could be millions of bytes long, and there is really nothing there. If UNIX were to allocate disk blocks for all that space, it could possibly exhaust the free space available. Instead, values are set internal to the inode and file data pointers so that only blocks needed to hold written data are allocated. The remaining span represents a hole that UNIX remembers. Attempts to read any of those blocks simply return zero values. Attempts to write to any location in the hole result in a real disk block being allocated and written, so everything continues to appear normal. (One way to identify these files is to compare the size reported by ls -l with the size reported by ls -s: a file with holes has a length greater than the space its allocated blocks account for.)
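The comparison can be sketched in a few lines of shell. This is only an illustration: the temporary file is created for the demonstration, the variable names are ours, and whether a hole is actually left unallocated depends on the filesystem you run it on.

```shell
# Create a demonstration file: seek 1 MB past the start, then write one byte.
demo=$(mktemp)
dd if=/dev/zero of="$demo" bs=1 count=1 seek=1048576 2>/dev/null

# Length in bytes, as reported by ls -l (fifth column).
length_bytes=$(ls -l "$demo" | awk '{print $5}')

# Space actually allocated, as reported by ls -s
# (-k asks for 1024-byte units instead of the system default).
alloc_kb=$(ls -sk "$demo" | awk '{print $1}')

echo "length: $length_bytes bytes, allocated: $alloc_kb KB"

# A length larger than the allocated space suggests the file has holes.
if [ "$length_bytes" -gt $((alloc_kb * 1024)) ]; then
    echo "$demo appears to contain holes"
fi

rm -f "$demo"
```

On a filesystem that supports holes, the allocated figure should be a few kilobytes at most, while the length is just over one megabyte.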

Small files with large holes can be a serious concern to backup software, depending on how your software handles them. Simple copy programs will try to read the file sequentially, and the result is a stream with lots of zero bytes. When copied into a new file, blocks are actually allocated for the whole span and lots of space may be wasted. More intelligent programs, like dump, bypass the normal file system and read the actual inode and set of data pointers. Such programs only save and restore the actual blocks allocated, thus saving both tape and file storage.

Keep these comments in mind if you try to copy or archive a file that appears to be larger in size than the disk it resides on. Copying a file with holes to another device can cause you to suddenly run out of disk space.
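To see the effect for yourself, you can copy a file with a hole through a simple sequential copy and compare the space used before and after. The sketch below uses cat as the "simple copy program"; the file names and sizes are arbitrary, and the numbers reported will vary with the filesystem.

```shell
src=$(mktemp)
copy=$(mktemp)

# Build a file that is mostly hole: seek 4 MB in, then write one byte.
dd if=/dev/zero of="$src" bs=1 count=1 seek=4194304 2>/dev/null

# A sequential copy reads the hole as zero bytes and writes them all out.
cat "$src" > "$copy"

# The two files have identical contents...
if cmp -s "$src" "$copy"; then same=yes; else same=no; fi

# ...but may occupy very different amounts of disk space.
src_kb=$(du -k "$src" | awk '{print $1}')
copy_kb=$(du -k "$copy" | awk '{print $1}')

echo "identical contents: $same"
echo "original uses $src_kb KB, copy uses $copy_kb KB"

rm -f "$src" "$copy"
```

If the copy reports roughly 4 MB in use while the original reports only a few kilobytes, the copy program has filled in the hole.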

Simple Local Copies

The simplest form of backup is to make simple copies of your files and directories. You might make those copies to local disk, to removable disk, to tape, or to some other media. Some file copy programs will properly duplicate modification and access times, and copy owner and protection information, if you are super-user or the files belong to you. They seldom recreate links, however. Examples include:

Simple Archives

There are several programs that are available to make simple archives packed into disk files or onto tape. These are usually capable of storing all directory information about a file, and restoring much of it if the correct options are used. Running these programs may result in a change of either (or both) the atime and the ctime of items archived, however.[8]

[8] See "The UNIX Filesystem" for information about these file characteristics.

Specialized Backup Programs

There are several dedicated backup programs.

Encrypting Your Backups

You can improvise your own backup encryption if you have an encryption program that can be used as a filter and you use a backup program that can write to standard output, such as the dump, cpio, or tar commands. For example, to make an encrypted tape archive using the tar command and the des encryption program, you might use the following command:

# tar cf - dirs and files | des -ef | dd bs=10240 of=/dev/rmt8

Although software encryption has potential drawbacks (for example, the software encryption program can be compromised so it records all passwords), this method is certainly preferable to storing sensitive information on unencrypted backup.

Here is an example: suppose you have a des encryption program called des which prompts the user for a key and then encrypts its standard input to standard output.[9] You could use this program with the dump (called ufsdump under Solaris) program to back up the file system /u to the device /dev/rmt8 with the command:

[9] Some versions of the des command require that you specify the "-f -" option to make the program run as a filter.

# dump f - /u | des -e | dd bs=10240 of=/dev/rmt8
Enter key:

If you wanted to back up the filesystem with tar, you would instead use the command:

# tar cf - /u | des -e | dd bs=10240 of=/dev/rmt8
Enter key:

To read these files back, you would use the following command sequences:

# dd bs=10240 if=/dev/rmt8 | des -d | restore fi -
Enter key:

and:

# dd bs=10240 if=/dev/rmt8 | des -d | tar xpBfv -
Enter key:

In both of these examples, the backup programs are instructed to send the backup of file systems to standard output. The output is then encrypted and written to the tape drive.

NOTE: If you encrypt the backup of a filesystem and you forget the encryption key, the information stored on the backup will be unusable.
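Because a forgotten or mistyped key makes the backup worthless, it is worth reading an encrypted archive back before you rely on it. The following is a minimal sketch that wraps the tar pipelines shown above in two shell functions. The function names and the ENCRYPT and DECRYPT variables are our own invention; "des -e" and "des -d" are taken from the examples above and may need different options on your system. The output name can be a tape device or an ordinary file.

```shell
# Filter commands that read standard input and write standard output.
# The defaults follow the examples in the text; adjust for your des program.
ENCRYPT=${ENCRYPT:-"des -e"}
DECRYPT=${DECRYPT:-"des -d"}

# make_encrypted_archive OUTPUT PATH...
# Archive the named paths with tar, encrypt the stream, write it to OUTPUT.
make_encrypted_archive() {
    out=$1
    shift
    tar cf - "$@" | $ENCRYPT | dd bs=10240 of="$out" 2>/dev/null
}

# list_encrypted_archive INPUT
# Decrypt INPUT and list the archive's table of contents without extracting.
list_encrypted_archive() {
    dd bs=10240 if="$1" 2>/dev/null | $DECRYPT | tar tf -
}
```

You might then run make_encrypted_archive /dev/rmt8 /u followed by list_encrypted_archive /dev/rmt8. If the listing shows the expected file names, the key you typed the second time matches the one used to write the archive.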

Backups Across the Net

A few programs can be used to do backups across a network link. Thus, you can do backups on one machine, and write the results to another. An obvious example would be using a program that can write to stdout, and then piping the output to a remote shell. Some programs are better integrated with networks, however.
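As a rough sketch of the "pipe to a remote shell" approach, the function below sends a tar archive to a tape drive attached to another machine. We assume rsh as the remote shell and dd as the program that writes the device on the far end; the function name, the REMOTE_SHELL variable, and the host name in the usage note are hypothetical.

```shell
# Command used to run a program on the remote machine (rsh is assumed here).
REMOTE_SHELL=${REMOTE_SHELL:-rsh}

# remote_backup HOST DEVICE PATH...
# Archive the named paths locally and write the stream to DEVICE on HOST.
remote_backup() {
    host=$1
    dev=$2
    shift 2
    tar cf - "$@" | $REMOTE_SHELL "$host" "dd of=$dev bs=10240"
}
```

For example, remote_backup tapehost /dev/rmt8 /u would write a backup of /u to the tape drive on the machine tapehost. An encryption filter of the kind described earlier can be placed in the pipeline ahead of the remote shell if the archive should not cross the network in the clear.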

Commercial Offerings

There are several commercial backup and restore utilities. Several of them feature special options that make indexing files or staging little-used files to slower storage (such as write-once optical media) easier. Unfortunately, lack of portability across multiple platforms, and compatibility with sites that may not have the software installed, might be drawbacks for many users. Be sure to fully evaluate the conditions under which you'll need to use the program and decide on a backup strategy before purchasing the software.

inode Modification Times

Most backup programs check the access and modification times on files and directories to determine which entries need to be stored to the archive. Thus, you can force an entry to be included (or not included) by altering these times. The touch command enables you to do so quickly and efficiently.
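The effect of touch on such a selection can be illustrated with find, which we use here only as a stand-in for a backup program that picks files by modification time. The file names and dates are made up for the demonstration.

```shell
work=$(mktemp -d)

# A marker file whose modification time represents the previous backup.
touch -t 202001010000 "$work/last-backup"

# Two files last modified before that backup.
touch -t 201901010000 "$work/old.txt"
touch -t 201901010000 "$work/forced.txt"

# Updating the modification time forces one of them to look changed.
touch "$work/forced.txt"

# Select the files modified since the previous backup.
changed=$(find "$work" -type f -newer "$work/last-backup" -print)
echo "$changed"

rm -rf "$work"
```

Only forced.txt is listed. Going the other way, touch -t with a date earlier than the marker would keep a file out of the selection.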

However, many programs that do backups will cause the access time on files and directories to be updated when they are read for the backup. As this behavior might break other software that depends on the access times, these programs sometimes use the utime system call to reset the access time back to the value it had prior to the backup.

Unfortunately, using the utime() system call will cause the inode change time, the ctime, to be altered. There is no system call to set the ctime back to what it was, so the ctime remains altered. This is a bane to system security investigations, because it wipes out an important piece of information about files that may have been altered by an intruder.
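You can observe this side effect from the shell. In the sketch below, touch -a stands in for a backup program resetting the access time; touch sets file times through the same kind of system call, although the exact call varies by system. The date is arbitrary.

```shell
f=$(mktemp)

# Set only the access time back to an old value,
# much as a backup program might after reading the file.
touch -a -t 199501010000 "$f"

# ls -lu shows the access time; ls -lc shows the inode change time.
atime_line=$(LC_ALL=C ls -lu "$f")
ctime_line=$(LC_ALL=C ls -lc "$f")

echo "atime: $atime_line"
echo "ctime: $ctime_line"

rm -f "$f"
```

The access time shows the 1995 date that was requested, but the change time shows the moment the touch command ran, and nothing can put the earlier value back.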

For this reason, we suggest that you determine how any candidate backup program behaves in this regard and choose one that does not alter file times. When considering a commercial backup system (or when designing your own), it is wise to avoid a system that changes the ctime or atime stored in the inode.

If you cannot use a backup system that directly accesses the raw disk partitions, you have two other choices:

  1. You can unmount your disks and remount them read-only before backing them up. This procedure will allow you to use programs such as cpio or tar without changing the atime.
  2. If your system supports NFS loopback mounts (such as Solaris or SunOS), you can create a read-only NFS loopback mount for each disk. Then you can back up the NFS-mounted disk, rather than the real device.