The Gnutella Protocol

A Non-Technical Introduction:

The Gnutella network is a form of distributed file sharing system. That is, each servent connected to the network is in theory considered equal. In pseudo-distributed file sharing systems such as Napster or Scour Exchange, each client connects to one or more central servers. With Gnutella networks there are no centralised servers. Each client also functions as a server. This way, the network becomes much more immune to shutdown or regulation.

Technical Theory of Operation:

The Gnutella network is a collection of Gnutella servents that cooperate in maintaining the network. A broadcast packet on the Gnutella network begins its life at a single servent, and is broadcast to all connected servents. These servents then rebroadcast it to all of their connected servents. This continues until the time to live of the packet expires. If all servents have eight other servents connected, one broadcast packet with a time to live of 7 can make it to 8^7 servents, which is 2097152. This is far more than enough to reach all the servents on the Gnutella network.

A reply packet on the Gnutella network begins its life as a response to a broadcast packet. It is forwarded back, hop by hop, along the path its initiating broadcast took until it reaches the servent that sent off the broadcast.

To keep track of where a packet came from, each packet is prefixed with a 16 byte Message ID. The Message ID is simply random data. A servent uses the same Message ID for all of its own broadcast packets. Each servent keeps a hash table of the most recent few thousand packets it has received. The hash table matches the Message ID with the IP address the message came from. To route a reply packet back where it came from, a servent looks up the Message ID in its hash table and sends the packet to the IP address the Message ID is matched to. This continues until the packet gets back home.

This results in a network with no hierarchy; every servent is equal.
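The Message ID hash table described above could be sketched roughly as follows. This is only an illustration, not how any particular servent is implemented: the class name ReplyRouter, the capacity of 4096 entries, and the choice to ignore a Message ID that has already been seen are all assumptions that the text does not state.

```python
from collections import OrderedDict


class ReplyRouter:
    """Remembers which address each broadcast Message ID arrived from."""

    def __init__(self, capacity=4096):  # assumed size ("a few thousand")
        self.capacity = capacity
        self.seen = OrderedDict()  # Message ID -> address it came from

    def note_broadcast(self, message_id, came_from):
        """Record a broadcast packet; return False if the ID is already known.

        Ignoring repeated IDs is an assumption of this sketch.
        """
        if message_id in self.seen:
            return False
        self.seen[message_id] = came_from
        if len(self.seen) > self.capacity:
            self.seen.popitem(last=False)  # forget the oldest entry
        return True

    def route_reply(self, message_id):
        """Return the address a reply should be sent to, or None if unknown."""
        return self.seen.get(message_id)
```

A servent would call note_broadcast() for every Ping or Query it receives, and route_reply() for every Pong or Hits packet; a result of None means the entry has already been forgotten and the reply cannot be routed.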
In order to be part of the network, one must contribute to the network. However, some servents are more equal than others. Servents running on faster internet connections are better suited to act as hubs (maintain more connections) than others, and therefore get responses from the network much faster.

Each Gnutella server only knows about the servers that it is directly connected to. All other servers are invisible, unless they announce themselves by answering a Ping or by replying to a Query. This provides some anonymity.

Unfortunately, the combination of having no hierarchy and the lack of a definitive source for a server list means that the network is not easily described. It is not a tree (since there is no hierarchy) and it is cyclic. Being cyclic means there is a lot of needless network traffic.

Line-ending Notes:

The most common way to end a line on the internet is to use a carriage return and line feed, or "\r\n". HTTP uses this, though the standards say the server must support other ways to end lines. Gnutella uses "\n\n" in connecting and push uploads, which is a different way of doing things.

Endianness Notes:

Endianness is another way of saying byte ordering. The byte ordering of data determines whether the most significant or the least significant byte comes first. The most common byte ordering on the internet is network byte order, which is big endian. Gnutella uses little endian for all values except the IPv4 address in pongs, hits, and push requests (internet addresses normally *stay* in network byte order). Most internet protocols would use network byte ordering for all binary data, so Gnutella is different.

Connecting:

The initiator opens a TCP connection.
The initiator sends "GNUTELLA CONNECT/0.4\n\n".
The receiver sends "GNUTELLA OK\n\n".
After this, it's all packets.

Header:

Bytes 0 - 15: Message ID: A Message ID is generated on the client for each new message it creates. Sixteen bytes of random data.
Byte 16: Function ID: What message type the packet is.
See the list of function types below for descriptions of the types. An 8 bit unsigned integer (byte order is irrelevant with one byte).
Byte 17: TTL Remaining: How many hops the packet has left before it should be dropped. An 8 bit unsigned integer (byte order is irrelevant with one byte).
Byte 18: Hops taken: How many hops this packet has already taken. Set the TTL on response messages to this value. An 8 bit unsigned integer (byte order is irrelevant with one byte).
Bytes 19 - 22: Data Length: The length of the Function-dependent data which follows. This is a 32 bit unsigned integer in little-endian byte order, which is the opposite of network byte order.

List of Functions:

* 0: Ping
* 1: Pong (Ping Response)
* 64: Push Request
* 128: Query
* 129: Hits (Query Response)

Ping:

A Ping has no body.
Routing: Rebroadcast packet through every available connection, except the one from which it was received.

Pong (Ping Response):

Bytes 0 - 1: Servent port: The TCP port number of the listening servent. A 16 bit unsigned integer in little-endian byte order.
Bytes 2 - 5: Servent IP: The IP address of the listening servent. A 32 bit unsigned integer in network byte order.
Bytes 6 - 9: File Count: The number of files shared by the servent. A 32 bit unsigned integer in little-endian byte order.
Bytes 10 - 13: Total Files Size: The total size of the files shared by the servent in kiB (1024 bytes). A 32 bit unsigned integer in little-endian byte order.
Routing: Forward packet only through the connection from which the Ping came.

Query:

Bytes 0 - 1: Minimum Speed: The minimum speed of servents which should perform the search and send results. A 16 bit unsigned integer in little-endian byte order.
Bytes 2 +: Search String: A NUL (zero) terminated character string which contains the search request.
Routing: Rebroadcast packet through every available connection, except the one from which it was received.
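The header, Pong and Query layouts above map fairly directly onto Python's struct module. The following is a minimal sketch under stated assumptions, not a reference implementation: the function and constant names are invented for illustration, and the latin-1 encoding of the search string is an assumption, since the text does not name a character set.

```python
import os
import socket
import struct

# "<" selects little-endian with no padding; "4s" keeps the IP as raw
# bytes, so it stays in network byte order as the layouts require.
HEADER = struct.Struct("<16sBBBI")  # Message ID, Function ID, TTL, Hops, Data Length
PONG = struct.Struct("<H4sII")      # port, IP, file count, total size in kiB

PING, PONG_ID, PUSH, QUERY, HITS = 0, 1, 64, 128, 129


def pack_header(function_id, ttl, hops, data_length, message_id=None):
    """Build the 23 byte header; a random Message ID is used if none is given."""
    if message_id is None:
        message_id = os.urandom(16)
    return HEADER.pack(message_id, function_id, ttl, hops, data_length)


def unpack_header(data):
    """Decode the first 23 bytes of a packet into a dict."""
    message_id, function_id, ttl, hops, length = HEADER.unpack(data[:HEADER.size])
    return {"message_id": message_id, "function_id": function_id,
            "ttl": ttl, "hops": hops, "data_length": length}


def pack_pong(port, ip, file_count, total_kib):
    """Build a 14 byte Pong body from a dotted-quad IP string."""
    return PONG.pack(port, socket.inet_aton(ip), file_count, total_kib)


def unpack_pong(body):
    """Return (port, ip, file_count, total_kib) from a Pong body."""
    port, ip, file_count, total_kib = PONG.unpack(body[:PONG.size])
    return port, socket.inet_ntoa(ip), file_count, total_kib


def pack_query(min_speed, search):
    """Build a Query body; latin-1 is an assumed encoding."""
    return struct.pack("<H", min_speed) + search.encode("latin-1") + b"\x00"


def unpack_query(body):
    """Return (min_speed, search) from a Query body."""
    (min_speed,) = struct.unpack_from("<H", body)
    search = body[2:].split(b"\x00", 1)[0].decode("latin-1")
    return min_speed, search
```

For example, a Query for "pie" would be sent as pack_header(QUERY, 7, 0, len(body)) followed by body, where body = pack_query(0, "pie"); a Ping is just pack_header(PING, 7, 0, 0) with nothing after it.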

Hits (Query Reply):

Byte 0: Number of Items: The number of Hits Items (see below) which follow this header. An 8 bit unsigned integer (byte order is irrelevant with one byte).
Bytes 1 - 2: Servent Port: The listening port number of the servent which found the results. A 16 bit unsigned integer in little-endian byte order.
Bytes 3 - 6: Servent IP: The IP address of the servent which found the results. A 32 bit unsigned integer in network byte order.
Bytes 7 - 8: Servent Speed: The speed of the servent which found the results. A 16 bit unsigned integer in little-endian byte order.
Bytes 9 - 10: Unknown: Unknown.
Bytes 11 +: List of Items: A Hits Item (see below) for each result found.
Last 16 Bytes: Response ID: The Response ID of the servent which found the results. Sixteen bytes of random data.
Routing: Forward packet only through the connection from which the Query came.

Push Request:

Bytes 0 - 15: Response ID: The Response ID of the server from which the requester wishes to receive a push.
Bytes 16 - 19: File Index: The File Index of the file requested. See Hits Items for more info. A 32 bit unsigned integer in little-endian byte order.
Bytes 20 - 23: Requester IP: The IP address of the servent requesting the push. A 32 bit unsigned integer in network byte order.
Bytes 24 - 25: Requester Port: The port number of the servent requesting the push. A 16 bit unsigned integer in little-endian byte order.
Routing: Forward packet only through the connection from which the Hits came.

Hits Items:

Bytes 0 - 3: File Index: Each file shared by a servent has an integer value associated with it. A 32 bit unsigned integer in little-endian byte order.
Bytes 4 - 7: File Size: The size of the file in octets. A 32 bit unsigned integer in little-endian byte order.
Bytes 8 +: Pathname: The pathname of the found file. The pathname is terminated by two NUL (zero) bytes.

Downloading:

Downloading is done by HTTP. A GET request is sent, with a URI that is constructed from the information in a Search Reply (a Hits packet).
The URI starts with /get/, then the File Index number (see Hits Items), then the filename.

Example download request:

    GET /get/1234/strawberry-rhubarb-pies.rcp HTTP/1.0\r\n
    Connection: Keep-Alive\r\n
    Range: bytes=0-\r\n
    \r\n

The server should respond with normal HTTP headers, then the file.

    HTTP/1.0 200 OK\r\n
    Server: Foo-Gnutella\r\n
    Content-type: application/binary\r\n
    Content-length: 948\r\n
    \r\n

Uploading:

Uploading is done in response to a Push Request. The uploader establishes a TCP connection, and sends GIV, then the File Index number, a colon, the Response ID of the uploader, a slash, the filename, and finally two newlines.

Example:

    GIV 1234:abcdefghijklmnop/Strawberry_Rhubarb_Pie.txt\n\n

The downloader then sends a GET request as if it had been trying to establish an HTTP connection all along. Resume may be done normally with the Range: header.
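
The Hits layout and the two text exchanges above can be tied together in a short sketch. It follows the byte offsets exactly as given in this document and should be read as an illustration only: all function names are invented, latin-1 is an assumed text encoding, and the Response ID in the GIV line is passed through as whatever text the caller supplies, since the text does not say how the sixteen bytes are written out.

```python
import socket
import struct


def parse_hits(body):
    """Split a Hits body into (port, ip, speed, items, response_id)."""
    count, port, ip, speed = struct.unpack_from("<BH4sH", body, 0)
    pos = 11              # bytes 9 - 10 are unknown and are skipped
    end = len(body) - 16  # the Response ID occupies the last 16 bytes
    items = []
    for _ in range(count):
        index, size = struct.unpack_from("<II", body, pos)
        # The pathname runs up to the double NUL terminator.
        stop = body.index(b"\x00\x00", pos + 8, end)
        items.append((index, size, body[pos + 8:stop].decode("latin-1")))
        pos = stop + 2
    return port, socket.inet_ntoa(ip), speed, items, body[end:]


def download_request(index, filename):
    """Build the GET request shown in the download example."""
    return ("GET /get/%d/%s HTTP/1.0\r\n"
            "Connection: Keep-Alive\r\n"
            "Range: bytes=0-\r\n"
            "\r\n" % (index, filename)).encode("latin-1")


def giv_line(index, response_id, filename):
    """Build the line an uploader sends after connecting for a push."""
    return ("GIV %d:%s/%s\n\n" % (index, response_id, filename)).encode("latin-1")


def parse_giv(line):
    """Return (index, response_id, filename) from a GIV line."""
    head, filename = line.decode("latin-1").rstrip("\n")[4:].split("/", 1)
    index, response_id = head.split(":", 1)
    return int(index), response_id, filename
```

A downloader would take one (index, size, pathname) tuple from parse_hits(), connect to the reported IP and port, and send download_request(index, pathname). When the connection arrives the other way round, it would read the GIV line with parse_giv() and then send the same GET request.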