NFS client problems
Using the output of nfsstat -c, look for the following symptoms:
- timeout > 5% of calls
- The client's RPC requests are timing out before the server can answer them, or the requests are not reaching the server. Check badxids to determine the cause of the timeouts.
- badxids ~ timeout
- The server is handling RPC requests that the client has already retransmitted, so the client receives duplicate replies. Increase the timeo parameter for this NFS mount to reduce retransmissions, or tune the server to reduce the average request service time.
- badxids ~ 0
- Combined with a large timeout count, this indicates that the network is dropping parts of NFS requests or replies between the NFS client and server. Reduce the NFS buffer size using the rsize and wsize mount parameters to increase the probability that NFS buffers will cross the network intact.
- badcalls > 0
- RPC calls on soft-mounted filesystems are timing out. If a server has crashed, badcalls can be expected to increase. If badcalls grows during "normal" operation, however, soft-mounted filesystems should use a larger timeo or retrans value to prevent RPC failures. Better yet, mount the filesystem without the soft option.
- cantconn > 1%
- This indicates that the NFS client is having trouble making a TCP connection to the NFS server. Often this is because the NFS server is down or was recently down. It can also indicate that the connection queue length in the NFS server is too small, or that an attacker is attempting a denial-of-service attack on the server by clogging the connection queue. If you cannot rule out connection queue length as the problem, use the -l parameter to nfsd to increase the queue length.
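
The checks above can be sketched as a small Python function. This is only an illustration, not a tool shipped with NFS: the function name diagnose, the finding labels, and the near tolerance used to decide when badxids is "about equal to" timeout or "about zero" are all assumptions. It also assumes both percentages are taken relative to the total calls counter, and that you copy the counters out of nfsstat -c by hand, since the output layout varies between systems.

```python
def diagnose(calls, timeouts, badxids, badcalls=0, cantconn=0, near=0.2):
    """Classify client RPC counters copied from `nfsstat -c`.

    `near` is an assumed tolerance: badxids counts as "about equal to"
    timeouts when it is within near * timeouts of it, and as "about zero"
    when it is at most near * timeouts.
    """
    findings = []
    if calls <= 0:
        return findings  # no traffic recorded, nothing to judge

    # timeout > 5% (assumed: of total calls); badxids tells us why
    if timeouts / calls > 0.05:
        if abs(badxids - timeouts) <= near * timeouts:
            # duplicate replies: the server answers, but too slowly
            findings.append("slow-server: raise timeo or tune the server")
        elif badxids <= near * timeouts:
            # few duplicate replies: requests or replies are being lost
            findings.append("lossy-network: reduce rsize and wsize")
        else:
            # hypothetical fallback label, not a case the text describes
            findings.append("timeouts-high: cause unclear")

    # badcalls > 0: soft mounts are giving up on requests
    if badcalls > 0:
        findings.append("soft-mount-failures: raise timeo/retrans or drop soft")

    # cantconn > 1% (assumed: of total calls)
    if cantconn / calls > 0.01:
        findings.append("connect-failures: check server and nfsd queue length")

    return findings


if __name__ == "__main__":
    for finding in diagnose(calls=10000, timeouts=800, badxids=790):
        print(finding)
```

With the sample counters shown, timeouts are 8% of calls and badxids nearly matches timeouts, so the function reports the slow-server case. Treat the thresholds as starting points and adjust near to suit how strictly you want to read "approximately".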