Creating Alternate Names for a File: Linking

As if one name for a file weren't enough, sometimes you want to have two, three, or a dozen names for the same file. This operation of creating alternate names for a file is called linking. The two major forms of linking are hard links and symbolic links (also called symlinks or soft links). Not every kind of filesystem supports both of these, or even either of them. This section describes filesystems under POSIX.

About Hard and Soft Links

A hard link to a file is indistinguishable from the original name for the file; there's no particular link that is more the "real name" for the file than any other.

The operating system keeps track of how many hard links reference the file at any particular time. When a file is first created, it starts with one link. Each new hard link increases the count. Each removed link reduces the count. When the last link to a file disappears, and the file is closed, the file goes away.
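The link count described above can be watched from Perl, because the fourth element of the list returned by stat is the number of hard links. The following is a minimal sketch, assuming a POSIX filesystem that supports hard links; the helper name link_count and the scratch directory are illustrative choices, not anything the system requires.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# link_count is a hypothetical helper: it returns the hard-link count
# (element 3 of the stat list) for the given path.
sub link_count {
    my ($path) = @_;
    my @info = stat($path) or die "cannot stat $path: $!";
    return $info[3];
}

# Work in a scratch directory so none of your own files are touched.
my $dir = tempdir(CLEANUP => 1);
my ($fred, $alias) = ("$dir/fred", "$dir/bigdumbguy");

open(my $fh, '>', $fred) or die "cannot create $fred: $!";
print $fh "yabba dabba doo\n";
close($fh) or die "cannot close $fred: $!";

my $count_new = link_count($fred);       # count for a freshly created file
link($fred, $alias) or die "cannot link $fred to $alias: $!";
my $count_linked = link_count($fred);    # count after adding a second name
unlink($alias) or die "cannot unlink $alias: $!";
my $count_after = link_count($fred);     # count after removing that name

print "links: $count_new, then $count_linked, then $count_after\n";
```

On a typical POSIX filesystem the three counts should come out as 1, 2, and 1.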

Every hard link to a file must reside on the same mounted filesystem (usually a disk or a part of a disk), so you cannot make a new hard link to a file that is on a different mounted filesystem.

Under most systems, hard links are also restricted for directories. To keep the directory structure as a tree rather than an arbitrary mish-mash, a directory is allowed only one name from the root, a link from the dot file within itself, and a bunch of dot-dot hard links from each of its subdirectories. If you try to create another hard link to a directory, you will get an error (unless you're the superuser, and then you get to spend all night restoring your mangled filesystem).

A symbolic link is a special kind of file that contains a pathname as data. When this file is opened, the operating system regards its contents as replacement characters for the pathname, causing the kernel to hunt through the directory tree some more, starting with the new name.

For example, if a symlink named fred contains the name barney, opening fred is really an indication to open barney. If barney is a directory, then fred/wilma refers to barney/wilma instead.
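To see the fred/wilma case in action, here is a small sketch using Perl's built-in symlink function. It assumes a system with symbolic links; the scratch directory and the text written into wilma are made up for the illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# Build barney/wilma with one line of (made-up) content.
mkdir("$dir/barney") or die "cannot mkdir barney: $!";
open(my $out, '>', "$dir/barney/wilma") or die "cannot create wilma: $!";
print $out "hello from wilma\n";
close($out) or die "cannot close wilma: $!";

# fred stores the relative name "barney", which the kernel resolves
# starting from the directory that holds fred.
symlink("barney", "$dir/fred") or die "cannot symlink fred: $!";

# Opening fred/wilma really opens barney/wilma.
open(my $in, '<', "$dir/fred/wilma") or die "cannot open fred/wilma: $!";
my $line = <$in>;
close($in);

print "read through fred/wilma: $line";
```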

The contents of a symlink (where a symlink points) do not have to refer to an existing file or directory. When fred is made, barney doesn't even have to exist: in fact, it may never exist! The contents of a symlink can refer to a path that leads you off the current filesystem, so you can create a symlink to a file on another mounted filesystem.

While following the new name, the kernel may run across another symlink. This new symlink gives even more new parts to the path to be followed. In fact, symlinks can point to other symlinks, usually with at least eight levels allowed, although this is rarely used in practice.

A hard link protects the contents of a file from being lost (because it counts as one of the names of the file). A symlink cannot keep the contents from disappearing. A symlink can cross mounted filesystems; a hard link cannot. Only a symlink can be made to a directory.
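The first difference is easy to demonstrate: give a file both a hard link and a symlink, remove the original name, and see which one still leads to the data. This is a sketch under the assumption of a POSIX filesystem with both kinds of links; the names hardfred and softfred and the helper first_line are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# first_line is a hypothetical helper: it returns the first line of a
# file, or undef if the file cannot be opened.
sub first_line {
    my ($path) = @_;
    open(my $fh, '<', $path) or return;
    my $line = <$fh>;
    close($fh);
    return $line;
}

my $dir = tempdir(CLEANUP => 1);
my ($orig, $hard, $soft) = map { "$dir/$_" } qw(fred hardfred softfred);

open(my $fh, '>', $orig) or die "cannot create $orig: $!";
print $fh "bedrock\n";
close($fh) or die "cannot close $orig: $!";

link($orig, $hard)    or die "cannot link $orig to $hard: $!";
symlink("fred", $soft) or die "cannot symlink $soft: $!";

# Remove the original name; the data still has one hard link left.
unlink($orig) or die "cannot unlink $orig: $!";

my $via_hard = first_line($hard);   # the contents survive under this name
my $via_soft = first_line($soft);   # the symlink now points at nothing

print "hard link: ", (defined $via_hard ? $via_hard : "gone\n");
print "symlink:   ", (defined $via_soft ? $via_soft : "gone\n");
```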

Creating Hard and Soft Links with Perl

The UNIX ln command creates hard links. The command

ln fred bigdumbguy

creates a hard link from the file fred (which must exist) to bigdumbguy. In Perl, this is expressed as:

link("fred","bigdumbguy") || die "cannot link fred to bigdumbguy: $!";

The link function takes two parameters, the old filename and a new alias for that file. The function returns true if the link was successful. As with the mv command, the UNIX ln command performs some behind-the-scenes magic, allowing you to specify the target directory for the new alias without naming the file within the directory. The link function (like the rename function) is not so smart, and you must specify the full filename explicitly.
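If you want the directory convenience that ln offers, you have to build the full name yourself. The sketch below is one possible way to do it, not a standard routine: link_into is a hypothetical helper, and the names fred and quarry are placeholders. It uses the standard File::Basename and File::Spec modules to append the file's own name when the target is a directory.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Spec;
use File::Temp qw(tempdir);

# link_into is a hypothetical helper, not a Perl built-in. If the new
# name is an existing directory, it appends the old file's basename,
# roughly the way "ln FILE DIRECTORY" does.
sub link_into {
    my ($old, $new) = @_;
    $new = File::Spec->catfile($new, basename($old)) if -d $new;
    link($old, $new) or die "cannot link $old to $new: $!";
    return $new;
}

my $dir    = tempdir(CLEANUP => 1);
my $fred   = "$dir/fred";
my $quarry = "$dir/quarry";

open(my $fh, '>', $fred) or die "cannot create $fred: $!";
close($fh) or die "cannot close $fred: $!";
mkdir($quarry) or die "cannot mkdir $quarry: $!";

# Plain link() refuses, because the name quarry is already in use.
my $plain_ok = link($fred, $quarry) ? 1 : 0;

# The helper spells out quarry/fred explicitly, so this one works.
my $alias = link_into($fred, $quarry);

print "plain link worked? $plain_ok; helper created $alias\n";
```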

For a hard link, the old filename cannot be a directory,[1] and the new alias must be on the same filesystem. (These restrictions are part of the reason that symbolic links were created.)

[1] Unless you are root and enjoy running fsck.

On systems that support symbolic links, the ln command may be given the -s option to create a symbolic link. So, to create a symbolic link from barney to neighbor (so that a reference to neighbor is actually a reference to barney), you'd use something like this:

ln -s barney neighbor

and in Perl, you'd use the symlink function, like so:

symlink("barney","neighbor") || die "cannot symlink to neighbor: $!";

Note that barney need not exist (poor Betty!), either now or in the future. In this case, a reference to neighbor will return something vaguely like No such file or directory.
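Here is a short sketch of that failure, assuming a system with symbolic links. The symlink call itself succeeds even though barney is missing; it is the later open that fails, and the error left in $! can be compared against ENOENT from the standard Errno module. The scratch directory is just for illustration.

```perl
use strict;
use warnings;
use Errno qw(ENOENT);
use File::Temp qw(tempdir);

my $dir      = tempdir(CLEANUP => 1);
my $neighbor = "$dir/neighbor";

# barney does not exist, but creating the symlink works anyway.
symlink("barney", $neighbor) or die "cannot symlink to $neighbor: $!";

my ($opened, $errno, $message) = (0, 0, '');
if (open(my $fh, '<', $neighbor)) {
    $opened = 1;
    close($fh);
} else {
    # Save both the numeric and the string form of $! right away.
    $errno   = $! + 0;
    $message = "$!";
}

print "open of neighbor failed: $message\n" unless $opened;
```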

When you invoke ls -l on the directory containing a symbolic link, you get an indication of both the name of the symbolic link and where the link points. Perl gives you this same information through the readlink function, which works surprisingly like the system call of the same name, returning the name pointed at by the specified symbolic link. So, this operation

if (defined($x = readlink("neighbor"))) {
    print "neighbor points at '$x'\n";
}

should talk about barney if all is well. If the selected symbolic link does not exist or can't be read or isn't even a symlink, readlink returns undef (definitely false), which is why we're testing it here.
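The three cases can be lined up side by side. The following sketch assumes a system with symbolic links; dino is simply a name that does not exist in the scratch directory. Note that readlink hands back the text stored in the symlink (here the relative name barney), not a full path.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my ($barney, $neighbor) = ("$dir/barney", "$dir/neighbor");

open(my $fh, '>', $barney) or die "cannot create $barney: $!";
close($fh) or die "cannot close $barney: $!";
symlink("barney", $neighbor) or die "cannot symlink to $neighbor: $!";

my $from_link    = readlink($neighbor);     # a real symlink
my $from_file    = readlink($barney);       # an ordinary file
my $from_missing = readlink("$dir/dino");   # a name that does not exist

for my $result (['neighbor', $from_link],
                ['barney',   $from_file],
                ['dino',     $from_missing]) {
    my ($name, $target) = @$result;
    if (defined $target) {
        print "$name points at '$target'\n";
    } else {
        print "$name is not a readable symlink\n";
    }
}
```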

On systems without symbolic links, both the symlink and readlink functions will fail, producing a run-time error. This is because there is no equivalent for symbolic links on systems that don't support them. Perl can hide some system-dependent features from you, but some just leak right through. This is one of them.
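Because the failure is a fatal run-time error rather than a false return value, a portable program can trap it with eval before relying on symlinks. The sketch below follows the probe suggested in Perl's own documentation for symlink; the wrapper name have_symlinks is a hypothetical choice for this example.

```perl
use strict;
use warnings;

# have_symlinks is a hypothetical helper name. The probe calls symlink
# with empty names, which cannot create anything; all we care about is
# whether the call raises a fatal error inside the eval.
sub have_symlinks {
    return eval { symlink("", ""); 1 } ? 1 : 0;
}

if (have_symlinks()) {
    print "symlink is available here\n";
} else {
    print "no symlink support on this system\n";
}
```

With a check like this, the program can decide up front whether to use symlinks or fall back on something else, such as copying the file.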