Transport layer

The transport layer has two major jobs: it must subdivide user-sized data buffers into network layer-sized datagrams, and it must enforce any desired transmission control such as reliable delivery. Two transport protocols that sit on top of IP are the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP), which offer different delivery guarantees.

TCP and UDP

TCP is best known as the first half of TCP/IP; as discussed in this and the preceding sections, the acronyms refer to two distinct services. TCP provides reliable, sequenced delivery of packets. It is ideally suited for connection-oriented communication, such as a remote login or a file transfer. Missing packets during a login session is both frustrating and dangerous -- what happens if rm *.o gets truncated to rm * ? TCP-based services are generally geared toward long-lived network connections, and TCP is used in any case when ordered datagram delivery is a requirement. There is overhead in TCP for keeping track of packet delivery order and the parts of the data stream that must be resent. This is state information. It's not part of the data stream, but rather describes the state of the connection and the data transfer. Maintaining this information for each connection makes TCP an inherently stateful protocol. Because there is state, TCP can adapt its data flow rate when the network is congested.UDP is a no-frills transport protocol: it sends large datagrams to a remote host, but it makes no assurances about their delivery or the order in which they are delivered. UDP is best for connectionless communication on local area networks in which no context is needed to send packets to a remote host and there is no concern about congestion. Broadcast-oriented services use UDP, as do those in which repeated, out of sequence, or missed requests have no harmful side effects.Reliable and unreliable delivery is the primary distinction between TCP and UDP. TCP will always try to replace a packet that gets lost on the network, but UDP does not. UDP packets can arrive in any order. If there is a network bottleneck that drops packets, UDP packets may not arrive at all. It's up to the application built on UDP to determine that a packet was lost, and to resend it if necessary. The state maintained by TCP has a fixed cost associated with it, making UDP a faster protocol on low-latency, high-bandwidth links. The price paid for speed (in UDP) is unreliability and added complexity to the higher level applications that must handle lost packets.

Port numbers

A host may have many TCP and UDP connections at any time. Connections to a host are distinguished by a port number, which serves as a sort of mailbox number for incoming datagrams. There may be many processes using TCP and UDP on a single machine, and the port numbers distinguish these processes for incoming packets. When a user program opens a TCP or UDP socket, it gets connected to a port on the local host. The application may specify the port, usually when trying to reach some service with a well-defined port number, or it may allow the operating system to fill in the port number with the next available free port number.When a packet is received and passed to the TCP or UDP handler, it gets directed to the interested user process on the basis of the destination port number in the packet. The quadruple of:

source IP address, source port, destination IP address, destination port

uniquely identifies every interhost connection in the network. While many processes may be talking to the process that handles remote login requests (therefore their packets have the same destination IP addresses and port numbers), they will have unique pairs of source IP addresses and port numbers. The destination port number determines which of the many processes using TCP or UDP gets the data.On most Unix systems port numbers below 1024 are reserved for the processes executing with superuser privileges, while ports 1024 and above may be used by any user. This enforces some measure of security by preventing random user applications from accessing ports used by servers. However, given that most nodes on the network don't run Unix, this measure of security is very questionable.