Preface
Twenty years ago, most computer centers had a few large computers shared by several hundred users. The "computing environment" was usually a room containing dozens of terminals. All users worked in the same place, with one set of disks, one user account information file, and one view of all resources. Today, local area networks have made terminal rooms much less common. Now, a "computing environment" almost always refers to distributed computing, where users have personal desktop machines, and shared resources are provided by special-purpose systems such as file, compute, and print servers. Each desktop requires redundant configuration files, including user information, network host addresses, and local and shared remote filesystem information.
A mechanism to provide consistent access to all files and configuration information ensures that all users have access to the "right" machines, and that once they have logged in they will see a set of files that is both familiar and complete. This consistency must be provided in a way that is transparent to the users; that is, a user should not need to know that a filesystem is located on a remote fileserver. The transparent view of resources must be consistent across all machines and also consistent with the way things work in a non-networked environment.
In a networked computing environment, it's usually up to the system administrator to manage the machines on the network (including centralized servers) as well as the network itself. Managing the network means ensuring that the network is transparent to users rather than an impediment to their work.
The Network File System (NFS) and the Network Information Service (NIS)[1] provide mechanisms for solving "consistent and transparent" access problems. The NFS and NIS protocols were developed by Sun Microsystems and are now licensed to hundreds of vendors and universities, not to mention dozens of implementations built from the published NFS and NIS specifications.
NIS centralizes commonly replicated configuration files, such as the password file, on a single host. It eliminates duplicate copies of user and system information and allows the system administrator to make changes from one place. NFS makes remote filesystems appear to be local, as if they were on disks attached to the local host. With NFS, all machines can share a single set of files, eliminating duplicate copies of files on different machines in the network. Using NFS and NIS together greatly simplifies the management of various combinations of machines, users, and filesystems.
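To make the idea of location transparency concrete, here is a minimal Python sketch that reports which mounted filesystem a path belongs to and whether that filesystem is NFS. It is an illustration only: the helper name filesystem_type, the sample mount table, and the server name fileserver are hypothetical, and the sketch assumes a whitespace-separated mount table whose first three fields are device, mount point, and filesystem type. Check your own system's mount table format before relying on that layout.
```python
import posixpath


def filesystem_type(path, mount_table):
    """Return (mount_point, fstype) for the longest mount point containing path.

    mount_table is text with one mount per line; the first three
    whitespace-separated fields are assumed to be device, mount point, fstype.
    Returns None if no mount point contains the path.
    """
    path = posixpath.normpath(path)
    best = None
    for line in mount_table.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the most specific (longest) matching mount point.
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fstype)
    return best


# Hypothetical mount table: a local root disk and an NFS-mounted /home.
SAMPLE = """\
/dev/dsk/c0t0d0s0 / ufs rw 0 0
fileserver:/export/home /home nfs rw,hard 0 0
"""

print(filesystem_type("/home/alice/report.txt", SAMPLE))  # ('/home', 'nfs')
print(filesystem_type("/etc/passwd", SAMPLE))             # ('/', 'ufs')
```
The user opens both paths in exactly the same way. Only the mount table reveals that one of them lives on a remote server, which is the transparency NFS is meant to provide.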
[1]NIS was formerly called the "Yellow Pages." While many commands and directory names retain the yp prefix, the formal name of the set of services has been changed to avoid conflicting with registered trademarks.
NFS provides network and filesystem transparency because it hides the actual, physical location of the filesystem. A user's files could be on a local disk, on a shared disk on a fileserver, or even on a machine located across a wide-area network. As a user, you're most content when you see the same files on all machines. Just having the files available, though, doesn't mean that you can access them if your user information isn't correct. Missing or inconsistent user and group information will break Unix file permission checking. This is where NIS complements NFS, by adding consistency to the information used to build and describe the shared filesystems. A user can sit down in front of any workstation in his or her group that is running NIS and be reasonably assured that he or she can log in, find his or her home directory, and access tools such as compilers, window systems, and publishing packages.
In addition to making life easier for the users, NFS and NIS simplify the tasks of system administrators, by centralizing the management of both configuration information and disk resources.
NFS can be used to create very complex filesystems, taking components from many different servers on the network. It is possible to overwhelm users by providing "everything everywhere," so simplicity should rule network design. Just as a database developer constructs views of a database to present only the relevant fields to an application, the user community should see a logical collection of files, user account information, and system services from each viewpoint in the computing environment. Simplicity often satisfies the largest number of users, and it makes the system administrator's job easier.
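The earlier point about inconsistent user information breaking permission checks can be illustrated with a short Python sketch. It compares passwd-format text taken from several hosts and reports any login name whose numeric user or group ID differs between them, which is the kind of drift that a single NIS-served password map is meant to prevent. The function names, the host names wahoo and bitatron, and the sample entries are all hypothetical. This is a rough diagnostic sketch, not a description of how NIS itself works.
```python
def parse_passwd(text):
    """Map login name -> (uid, gid) from passwd-format text."""
    accounts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # Skip malformed lines and "+"/"-" entries, which are not real accounts.
        if len(fields) < 4 or fields[0].startswith(("+", "-")):
            continue
        try:
            accounts[fields[0]] = (int(fields[2]), int(fields[3]))
        except ValueError:
            continue  # non-numeric uid or gid field
    return accounts


def find_id_mismatches(hosts):
    """Return {user: {host: (uid, gid)}} for users whose IDs differ across hosts.

    hosts maps a host name to the passwd-format text found on that host.
    """
    seen = {}
    for host, text in hosts.items():
        for user, ids in parse_passwd(text).items():
            seen.setdefault(user, {})[host] = ids
    return {
        user: by_host
        for user, by_host in seen.items()
        if len(set(by_host.values())) > 1
    }


# Hypothetical password files from two workstations.
HOSTS = {
    "wahoo": "root:x:0:0:Super-User:/:/sbin/sh\n"
             "alice:x:1001:10:Alice:/home/alice:/bin/ksh\n",
    "bitatron": "root:x:0:0:Super-User:/:/sbin/sh\n"
                "alice:x:2001:10:Alice:/home/alice:/bin/ksh\n",
}

print(find_id_mismatches(HOSTS))
# {'alice': {'wahoo': (1001, 10), 'bitatron': (2001, 10)}}
```
In this sample, alice has a different numeric user ID on each host, so files she creates from one workstation would appear to belong to someone else (or to no known user) when viewed from the other.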
Who this tutorial is for
This tutorial is of interest to system administrators and network managers who are installing or planning new NFS and NIS networks, or debugging and tuning existing networks and servers. It is also aimed at the network user who is interested in the mechanics that hold the network together. We'll assume that you are familiar with the basics of Unix system administration and TCP/IP networking. Terms that are commonly misused or particular to a discussion will be defined as needed. Where appropriate, an explanation of a low-level phenomenon, such as Ethernet congestion, will be provided if it is important to a more general discussion, such as NFS performance on a congested network. Models for these phenomena will be drawn from everyday examples rather than their more rigorous mathematical and statistical roots.
This tutorial focuses on the way NFS and NIS work, and how to use them to solve common problems in a distributed computing environment. Because Sun Microsystems developed and continues to advance NFS and NIS, this tutorial uses Sun's Solaris operating system as the frame of reference. Thus, if you are administering NFS on non-Solaris systems, you should use this tutorial in conjunction with your vendor's documentation, since utilities and their options will vary by implementation and release.
This tutorial explains what the configuration files and utilities do, and how their options affect performance and system administration issues. By walking through the steps comprising a complex operation or by detailing each step in the debugging process, we hope to shed light on techniques for effective management of distributed computing environments. There are very few absolute constraints or thresholds that are universally applicable, so we refrain from stating them. This tutorial should help you to determine the fair utilization and performance constraints for your network.