Fundamentals of Networking - Game Dev - Java Programming Language

Protocols

One of the first things to consider with networking is that you may be communicating with machines running different operating systems. For example, if your server runs on Linux, you would want clients on, say, Windows and Macs to be able to access it as well. To accomplish this, all of the operating systems need to speak the same data transmission language. This is achieved by using protocols. A protocol is simply a standard for how data should be transferred across a network. Although there are many different protocols, we will focus on TCP/IP, the protocol suite on which the Internet is built. The name is a little misleading, however, in that TCP/IP actually offers two different transport protocols. These are TCP, which stands for Transmission Control Protocol, and UDP, which stands for User Datagram Protocol. Let's now look at the differences between these two protocols.

TCP: Transmission Control Protocol

When using TCP, you are first required to create a connection to another computer. This may seem obvious, but not all protocols require a connection, as we will see in the next section on UDP. Once a connection is established, you can use incoming and outgoing streams to send and receive data over the network. The main advantage of TCP is that it delivers your data reliably and in the correct order, and it discards duplicate packets. TCP also has congestion control and flow control mechanisms, which are useful when streaming lots of data. A good deal is done to your data before it is sent: TCP adds its own header to the data and may split the data across several packets. This matters if you need to keep the amount of data you send as small as possible. It is quite a waste if a game sends one-byte messages as individual TCP packets, because the headers are far larger than the data itself. Conversely, putting too large an amount of data into one packet can also be inefficient.
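To make the cost of tiny packets concrete, the following sketch estimates how much of each packet is actual game data. It assumes the minimum header sizes of 20 bytes for IPv4 and 20 bytes for TCP (options can make either header larger, and lower network layers add more on top). The class and method names are made up for illustration.

```java
// Rough per-packet overhead estimate, assuming minimum-size headers
// (20 bytes IPv4 + 20 bytes TCP, no options). All names are hypothetical.
final class TcpOverheadSketch {
    static final int IPV4_HEADER_BYTES = 20; // minimum IPv4 header size
    static final int TCP_HEADER_BYTES = 20;  // minimum TCP header size

    // Bytes sent for one packet carrying the given payload.
    static int wireBytes(int payloadBytes) {
        return IPV4_HEADER_BYTES + TCP_HEADER_BYTES + payloadBytes;
    }

    // Fraction of the transmitted bytes that is useful payload.
    static double payloadFraction(int payloadBytes) {
        return (double) payloadBytes / wireBytes(payloadBytes);
    }

    public static void main(String[] args) {
        for (int payload : new int[] {1, 100, 1000}) {
            System.out.printf("%d payload bytes -> %d bytes sent (%.1f%% payload)%n",
                    payload, wireBytes(payload), 100 * payloadFraction(payload));
        }
    }
}
```

Under these assumptions a one-byte payload costs 41 bytes on the network, so less than 3% of what is sent is game data. Grouping several small updates into one write is the usual remedy.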

Note

A packet is simply a unit of data that is sent over a network.

UDP: User Datagram Protocol

UDP can be described as a connectionless protocol, as you do not actually create a connection to the remote computer. With UDP, you simply specify where the information is to go, and you never know whether it gets there or not. This makes UDP an unreliable protocol: packets can be lost, duplicated, or delivered out of order. This sounds terrible, doesn't it? The advantage of UDP over TCP is that it can be much more efficient. For example, TCP has congestion control built into it, which limits the initial sending rate of a connection to alleviate network congestion, whereas UDP has no such thing, meaning we can use the full available bandwidth from the start. In addition, we can handle lost packets ourselves by adding a simple acknowledgment message that tells the sender whether a packet arrived. However, adding too much error checking can leave UDP no more efficient than TCP.
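One simple way to notice lost or duplicated UDP packets is to prefix every packet with a sequence number. The sketch below is only one possible scheme, not a standard one: the 4-byte header layout, the class name, and the method names are all assumptions made for illustration. It ignores sequence-number wraparound, and it deliberately drops packets that arrive late, which suits game state updates where only the newest data matters.

```java
import java.nio.ByteBuffer;

// Hypothetical loss/duplicate detection for UDP payloads using a
// 4-byte sequence number placed in front of the data.
final class SequencedPacketSketch {
    // Sender side: build [sequence (4 bytes, big-endian)][payload].
    static byte[] encode(int sequence, byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(sequence)
                .put(payload)
                .array();
    }

    // Read the sequence number back out of a received packet.
    static int sequenceOf(byte[] packet) {
        return ByteBuffer.wrap(packet).getInt();
    }

    // Receiver side state.
    private int nextExpected = 0; // lowest sequence number not yet seen
    private int missing = 0;      // packets skipped over so far

    // Returns true if the packet is new; false if it is a duplicate
    // or arrived after a newer packet was already accepted.
    boolean accept(byte[] packet) {
        int sequence = sequenceOf(packet);
        if (sequence < nextExpected) {
            return false;
        }
        missing += sequence - nextExpected; // count the gap as lost
        nextExpected = sequence + 1;
        return true;
    }

    int missingCount() {
        return missing;
    }

    public static void main(String[] args) {
        SequencedPacketSketch receiver = new SequencedPacketSketch();
        byte[] data = {42};
        System.out.println(receiver.accept(encode(0, data))); // new packet
        System.out.println(receiver.accept(encode(2, data))); // new, packet 1 skipped
        System.out.println(receiver.accept(encode(2, data))); // duplicate
        System.out.println("missing: " + receiver.missingCount());
    }
}
```

A receiver could report the missing count back to the sender, which then decides whether resending is worth the extra traffic.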

IP Addresses

An IP address identifies a computer on a network (or the Internet). If you have Internet access via a modem or cable (or are on a local area network), you can find your IP address by opening the command prompt in Windows and typing:

ipconfig

When you do this, you will see something similar to the following figure. (Note that you may see two IP addresses if you are also connected to a local area network.)
Screenshot-1: Finding out your IP address

If you have a dial-up connection to the Internet, it is likely that you will be assigned a new IP address dynamically each time you connect. With an always-on connection such as cable, your address usually changes far less often, and some providers will assign you a static IP address that never changes.

So we now know how to find out IP addresses; let's see what they actually are. Currently, an IP address is a 32-bit number, written as four bytes in the form x.x.x.x, where each "x" is a single byte shown as a decimal value from 0 to 255. Looking at the previous image, the IP address is 192.168.0.133. Note that the way IP addresses are represented is being revised. The current 32-bit system is known as IPv4; its successor, IPv6, represents IP addresses by means of a 128-bit number. More information on this standard can be found at the following web site: http://www.ipv6.org/.
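To see that a dotted address really is just a 32-bit number, the following sketch packs the four bytes into a single value and unpacks them again. The class and method names are illustrative only, and a long is used to hold the result so that addresses above 127.255.255.255 do not show up as negative numbers.

```java
// Converts between dotted IPv4 notation and the underlying 32-bit value.
// Names are hypothetical; a long avoids Java's signed int pitfalls.
final class Ipv4Sketch {
    // "192.168.0.133" -> 32-bit value stored in a long.
    static long toLong(String dotted) {
        String[] parts = dotted.split("\\.", -1);
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected four bytes: " + dotted);
        }
        long value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("Byte out of range: " + part);
            }
            value = (value << 8) | octet; // shift in one byte at a time
        }
        return value;
    }

    // 32-bit value -> "x.x.x.x".
    static String toDotted(long value) {
        return ((value >> 24) & 0xFF) + "." + ((value >> 16) & 0xFF) + "."
                + ((value >> 8) & 0xFF) + "." + (value & 0xFF);
    }

    public static void main(String[] args) {
        long value = toLong("192.168.0.133");
        System.out.println(Long.toHexString(value)); // prints c0a80085
        System.out.println(toDotted(value));         // prints 192.168.0.133
    }
}
```

In real code you would normally let java.net.InetAddress do the parsing; the manual version is shown only to expose the byte layout.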

Ports

We now know computers can be distinguished from each other over a network via IP addresses, but what if there are several server apps running on a single computer? How do you determine which server a network message is intended for? The answer to this is ports. A port isn't actually a physical thing but is simply a 16-bit value. The operating system keeps track of which ports are in use and which are not. Ports 1 to 1023 are reserved by the system for common services (such as FTP, which runs on port 21). This leaves ports 1024 to 65535 free for us to use in our apps. Note that port 0 is reserved and is not used as a real destination port.
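The port ranges described above can be written down as a couple of small checks. This is a sketch with made-up class and method names, and port 5000 is just an arbitrary example. The main method also shows the operating system's bookkeeping at work: in Java, passing 0 to the ServerSocket constructor asks the system to pick a port that is currently unused.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helpers for the port ranges discussed in the text.
final class PortRangeSketch {
    static final int MAX_PORT = 0xFFFF; // 65535, the largest 16-bit value

    // Ports set aside for common services such as FTP (21).
    static boolean isReserved(int port) {
        return port >= 1 && port <= 1023;
    }

    // Ports our own applications are free to use.
    static boolean isAvailableToApps(int port) {
        return port >= 1024 && port <= MAX_PORT;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("21 reserved? " + isReserved(21));
        System.out.println("5000 usable? " + isAvailableToApps(5000));

        // Port 0 here means "let the operating system choose a free port".
        try (ServerSocket server = new ServerSocket(0)) {
            System.out.println("OS picked port " + server.getLocalPort());
        }
    }
}
```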

Note

There is a body known as IANA (Internet Assigned Numbers Authority), which maintains the registry of well-known port assignments. For more information on this, see the following web page: http://www.iana.org/.

Sockets

Whereas IP addresses and ports are used to uniquely identify machines and servers, a socket is used to establish connections and send data between machines. The best way to think of a socket is as a pipe through which data can flow between two machines on a network. There are two major types of sockets that we are interested in: stream sockets and datagram sockets.

Stream and Datagram Sockets

A stream socket is used with TCP, and as you know from before, TCP requires a connection to the remote machine before data can be sent. Once a connection is established, we use the stream socket to obtain an input stream, an output stream, or both, so we can easily send and receive data over the connection.
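As a minimal sketch of stream sockets in Java, the following example starts a one-shot echo server on the local machine, connects to it, sends one line of text, and reads the reply. It uses the standard java.net.ServerSocket and java.net.Socket classes; the class name, the echoOnce helper, the two-second timeout, and the line-based message format are assumptions chosen to keep the example short.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical one-shot TCP echo over the loopback interface.
final class TcpEchoSketch {
    static BufferedReader readerFor(Socket socket) throws IOException {
        return new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
    }

    static PrintWriter writerFor(Socket socket) throws IOException {
        // The second argument enables auto-flush on println.
        return new PrintWriter(
                new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    // Sends one line to a throwaway local server and returns what comes back.
    static String echoOnce(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 lets the operating system choose a free port.
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Thread serverThread = new Thread(() -> {
                // Server side: accept one connection and echo one line.
                try (Socket peer = server.accept();
                     BufferedReader in = readerFor(peer);
                     PrintWriter out = writerFor(peer)) {
                    out.println(in.readLine());
                } catch (IOException e) {
                    System.err.println("Server failed: " + e);
                }
            });
            serverThread.setDaemon(true);
            serverThread.start();

            // Client side: connect, then use the output and input streams.
            try (Socket client = new Socket(loopback, server.getLocalPort());
                 PrintWriter out = writerFor(client);
                 BufferedReader in = readerFor(client)) {
                client.setSoTimeout(2000); // do not wait forever for a reply
                out.println(message);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(echoOnce("hello over TCP"));
    }
}
```

A real game server would keep accepting connections in a loop and handle each client separately; a single exchange is enough here to show the connection and its two streams.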

A datagram socket is different in that it does not have any streams associated with it. Instead, it sends and receives individual packets, each of which also records where it came from. The receiver can then reply to a message by using the sender information contained in the packet.
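The reply mechanism can be sketched with Java's standard java.net.DatagramSocket and java.net.DatagramPacket classes. In this example, two datagram sockets on the local machine stand in for a client and a server, and the server answers by reading the sender's address and port out of the packet it received. The class name, the helper method, the 512-byte buffers, and the two-second timeouts are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical UDP round trip over the loopback interface.
final class UdpReplySketch {
    static String sendAndAwaitReply(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 lets the operating system choose free ports for both sockets.
        try (DatagramSocket server = new DatagramSocket(0, loopback);
             DatagramSocket client = new DatagramSocket(0, loopback)) {
            // UDP gives no delivery guarantee, so never wait forever.
            server.setSoTimeout(2000);
            client.setSoTimeout(2000);

            // Client: no connection, just address the packet and send it.
            byte[] data = message.getBytes(StandardCharsets.UTF_8);
            client.send(new DatagramPacket(data, data.length, loopback, server.getLocalPort()));

            // Server: receive the packet into a buffer.
            DatagramPacket request = new DatagramPacket(new byte[512], 512);
            server.receive(request);

            // Server: reply to whoever sent it, using the address and
            // port carried by the received packet.
            server.send(new DatagramPacket(request.getData(), request.getLength(),
                    request.getAddress(), request.getPort()));

            // Client: read the reply.
            DatagramPacket reply = new DatagramPacket(new byte[512], 512);
            client.receive(reply);
            return new String(reply.getData(), 0, reply.getLength(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAndAwaitReply("hello over UDP"));
    }
}
```

Everything runs in one thread here because sending a datagram does not wait for the receiver. Across a real network either packet could be lost, in which case the receive call would time out and the caller would have to decide whether to resend.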