I've started work on a program which utilizes the gnutella protocol (as of version 0.48). Basically, you connect a SOCK_STREAM (tcp) socket to any other gnutella server, send a GNUTELLA CONNECT/0.4[lf][lf] and expect back a GNUTELLA OK[lf][lf]. At this point the server expects you to identify yourself. You send a type 0x00 message to whoever you just connected to, and the server responds with how many files it is sharing, and the total size of those files (in KB). You'll also get a response from everybody connected to the machine you connect to, and so on, until the TTL expires on the message.
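The connect step above can be sketched in a few lines of Python. This is only an illustration of the two strings described here; the helper names (`recv_exact`, `handshake`) are made up for the example, and it assumes [lf] means a single line feed byte (0x0A).

```python
import socket

CONNECT_REQUEST = b"GNUTELLA CONNECT/0.4\n\n"
CONNECT_OK = b"GNUTELLA OK\n\n"


def recv_exact(sock, count):
    """Read exactly `count` bytes, or raise if the peer closes early."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("peer closed the connection during handshake")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def handshake(sock):
    """Send the connect string and report whether the expected OK came back."""
    sock.sendall(CONNECT_REQUEST)
    return recv_exact(sock, len(CONNECT_OK)) == CONNECT_OK


# Hypothetical usage against a real host (address and port are up to you):
#   sock = socket.create_connection((host, port))
#   if handshake(sock): ...start sending messages...
```

Reading an exact byte count keeps the sketch from swallowing the first bytes of whatever message the server sends right after its OK.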
At this point, the server will start bombarding you with information about other servers (which fills the host catcher, and gnutellanet stats). You'll also get search requests. You're supposed to decrement the TTL and pass it on to any other servers you're connected to (if TTL > 0). If you have no matching files you can simply discard the packet, otherwise you should build a query response to that message and send it back from where it came.
The header is fixed for all messages and ends with the size of the data area which follows. The header contains a Microsoft GUID (Globally Unique Identifier for you nonWinblows people) which is the message identifier. My crystal ball reports that "the GUIDs only have to be unique on the client", which means that you can really put anything here, as long as you keep track of it (a client won't respond to you if it sees the same message ID again). If you're responding to a message, be sure you haven't seen the message ID (from that host) before, copy their message ID into your response and send it on its way. That message ID is followed by a function ID (one byte), which looks to be a bitmask. The function ID indicates what to do with the packet (search request, search response, server info, etc). The next field is a byte TTL. Every packet you receive you should dec (or -- for the C guys) the TTL and pass the packet on if the TTL is still > 0 (i.e. if (--hdr.TTL) { [pass on] }, god I love C). You should also inc the hop count. Seems redundant? Well, some people have smaller TTLs, and you have the right to drop any message you want to based on its hop count. The header finishes up by telling us how large the function-dependent data that follows is.
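Keeping track of message IDs you have already seen can be as simple as a set. The sketch below is one way to do it, not how gnutella itself does it; the class name and the choice to key on (host, message ID) are assumptions taken from the "from that host" remark above.

```python
class SeenMessages:
    """Remembers which message IDs have already arrived from which host."""

    def __init__(self):
        self._seen = set()

    def first_time(self, host, message_id):
        """Return True the first time (host, message_id) shows up, else False."""
        key = (host, bytes(message_id))
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


# Hypothetical usage: only answer or forward a message when first_time() is True.
seen = SeenMessages()
```

A real client would also want to expire old entries so the set does not grow forever; that part is left out here.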
Searching is easy: just build a type 0x80 packet, add a WORD for the minimum connection speed (in kbps), then the null-terminated search string. There isn't a response from people who have no match, but a result will come back as a type 0x81 message. There will be a Search Response header followed by N Search Response Items with double-NULL-terminated filenames. To finish this up, there's a Search Response footer with the full 128 bit (16 byte) client ID of the server that found the result.
Downloads are POC. If you want a file from a server, you connect to the server, and send an HTTP request for it. The URL is of the form /get/[file_id]/[filename]. The file ID was returned with the search result. The gnutella HTTP server also supports resuming a transfer via the Content-range: HTTP header. If you're just curious, the User-Agent is gnutella. You can actually load up Netscape, and get a file from a Gnutella server. Pretty cool, eh? Here's a dump of what an HTTP request looks like:
GET /get/293/rhubarb_pie.rcp HTTP/1.0
User-Agent: gnutella
Yes, the user-agent header and HTTP version are required. If the server is behind a firewall which does not allow incoming connections, the client can negotiate a push connection. This is a function ID 0x40 packet. It contains the ClientID128 (GUID) of the server, followed by the File ID requested, and the IP address and port of the client.
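For the plain (non-push) download, building the request is just string formatting. A small sketch follows; `build_get_request` is a name invented for this example, and the CRLF line endings plus the blank line at the end are assumptions based on ordinary HTTP/1.0, since the dump above does not show them.

```python
def build_get_request(file_index, filename):
    """Build the HTTP/1.0 request for /get/[file_id]/[filename]."""
    request = (
        "GET /get/%d/%s HTTP/1.0\r\n"
        "User-Agent: gnutella\r\n"
        "\r\n" % (file_index, filename)
    )
    # ASCII is assumed here; the document does not say how other names are encoded.
    return request.encode("ascii")


print(build_get_request(293, "rhubarb_pie.rcp").decode("ascii"))
```

The file index and file name are the ones that came back in the search result.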
Awlright, some people have read this document and still keep asking me things like "What's in the first 10 bytes of the header?", so for people who can't figure out what the syntax of a Delphi record looks like, or can't read English so well, here are some nice tables:
Bytes 0 - 15: Message ID:
A message ID is generated on the client for each new message it creates. The 16 byte value is created with the Windows API call CoCreateGUID(), which in theory will generate a new globally unique value every time you call it. See the text above for a comment about this value's uniqueness.
Byte 16: Function ID:
What message type the packet is. See the table of message types below for descriptions of the types.
Byte 17: TTL Remaining:
How many hops the packet has left before it should be dropped.
Byte 18: Hops taken:
How many hops this packet has already taken. Set the TTL on response messages to this value!
Bytes 19 - 22: Data Length:
The length of the function-dependent data which follows. There has been some discussion as to whether this value is actually only 2 bytes and the last 2 bytes are something else. Seems to work with 4 for me. Also there is a question as to signed or unsigned integers. Don't know that either, I can't get gnutella to try and send a 2^31 + 1 byte packet :).
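The header table above maps neatly onto Python's struct module. Treat this as a sketch under assumptions: the text does not say which byte order the Data Length uses, so little-endian and unsigned are guesses, and `uuid.uuid4()` merely stands in for CoCreateGUID(). The function names are made up for the example.

```python
import struct
import uuid

# 16-byte message ID, function ID, TTL, hops, 4-byte data length = 23 bytes.
# "<" = little-endian with no padding (byte order is an assumption).
HEADER = struct.Struct("<16sBBBI")


def new_message_id():
    """Any 16 bytes you can keep track of will do; a random UUID is one option."""
    return uuid.uuid4().bytes


def pack_header(message_id, function_id, ttl, hops, data_length):
    """Build the 23-byte header that precedes every message body."""
    return HEADER.pack(message_id, function_id, ttl, hops, data_length)


def unpack_header(data):
    """Return (message_id, function_id, ttl, hops, data_length)."""
    return HEADER.unpack_from(data)


# A ping (function 0x00) has no body, so its data length is 0.
ping = pack_header(new_message_id(), 0x00, 5, 0, 0)
print(len(ping))
```

After reading the 23 header bytes off the socket, read exactly `data_length` more bytes to get the body.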
0x00: Ping:
An empty message (datalen = 0) sent by a client requesting a 0x01 from everyone on the network. This message type should be responded to with a 0x01 and passed on.
0x01: Ping Response:
Sent in response to a 0x00, this message contains the host ip and port, how many files the host is sharing and their total size.
0x40: Client Push Request:
For servants behind a firewall, where the client cannot reach the server directly, a push request message is sent, asking the server to connect out to the client and perform an upload.
0x80: Search:
This is a search message and contains the query string as well as the minimum speed.
0x81: Search Response:
These are the results of a 0x80 search request. It contains the IP address, port, and speed of the serverant, followed by a list of file sizes and names, and the ClientID128 of the serverant which found the files. ClientID128 is another 16 byte GUID. However, this GUID was created once when the client was installed, is stored in the gnutella.ini, and never changes.
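Since 0x01 and 0x81 are replies to 0x00 and 0x80, the reply header can be derived from the request header: copy the message ID and set the TTL to the request's hop count, as noted in the header table. The sketch below assumes little-endian fields and starts the reply's own hop count at 0, neither of which the text confirms; `build_reply_header` is a hypothetical helper.

```python
import struct

HEADER = struct.Struct("<16sBBBI")  # byte order is an assumption

PING_RESPONSE = 0x01
SEARCH_RESPONSE = 0x81


def build_reply_header(request_header, reply_function_id, body_length):
    """Copy the request's message ID and use its hop count as the reply's TTL."""
    message_id, _function_id, _ttl, hops, _length = HEADER.unpack_from(request_header)
    # Starting the reply's hops at 0 is a guess; the document does not say.
    return HEADER.pack(message_id, reply_function_id, hops, 0, body_length)
```

The idea is that a reply only needs to survive as many hops as the request took to reach you.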
A Ping has no body.
Bytes 23 - 24: Host port:
The TCP port number of the listening host
Bytes 25 - 28: Host IP:
The IP address of the listening host, in network byte order.
Bytes 29 - 32: File Count:
An integer value indicating the number of files shared by the host. No idea if this is a signed or unsigned value.
Bytes 33 - 36: Files Total Size
An integer value indicating the total size of files shared by the host, in kilobytes (KB). No idea if this is a signed or unsigned value.
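The four ping response fields above (bytes 23 - 36 of the message) make a 14 byte body. Here is a rough sketch of packing and unpacking it. Assumptions: the port and the two counts are little-endian and unsigned (the text admits it doesn't know about the sign), while the IP goes out in network byte order as stated. The function names and sample values are invented.

```python
import socket
import struct

# port (2), IP (4, network byte order), file count (4), total size in KB (4)
PONG_BODY = struct.Struct("<H4sII")


def pack_pong_body(port, ip, file_count, total_kb):
    """Build the 14-byte body of a 0x01 message."""
    return PONG_BODY.pack(port, socket.inet_aton(ip), file_count, total_kb)


def unpack_pong_body(body):
    """Return (port, dotted_ip, file_count, total_kb)."""
    port, raw_ip, file_count, total_kb = PONG_BODY.unpack(body)
    return port, socket.inet_ntoa(raw_ip), file_count, total_kb


body = pack_pong_body(6346, "10.0.0.5", 12, 3400)
print(unpack_pong_body(body))
```

`socket.inet_aton` already produces the address in network byte order, so it is packed as four raw bytes and left alone.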
Bytes 23 - 24: Minimum speed:
The minimum speed of serverants which should perform the search. This is entered by the user in the "Minimum connection speed" edit box.
Bytes 25 +: Search query:
A NULL terminated character string which contains the search request.
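A search body is therefore a 2 byte speed followed by the query and a single NULL. A minimal sketch, assuming a little-endian unsigned WORD and ASCII text (neither is spelled out above); the function names are hypothetical.

```python
import struct


def pack_search_body(min_speed_kbps, query):
    """Minimum speed WORD, then the NULL-terminated query string."""
    return struct.pack("<H", min_speed_kbps) + query.encode("ascii") + b"\x00"


def unpack_search_body(body):
    """Return (min_speed_kbps, query) from the body of a 0x80 message."""
    (min_speed,) = struct.unpack_from("<H", body)
    query, _sep, _rest = body[2:].partition(b"\x00")
    return min_speed, query.decode("ascii")


print(unpack_search_body(pack_search_body(56, "rhubarb pie")))
```

The length of this body is what goes in the Data Length field of the header.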
Byte 23: Num Recs:
Number of Search Response Items which follow this header.
Bytes 24 - 25: Host Port:
The listening port number of the host which found the results.
Bytes 26 - 29: Host IP:
The IP address of the host which found the results. In network byte order.
Bytes 30 - 33: Host Speed:
The speed of the host which found the results. This may be incorrect. I would assume that only 2 bytes would be needed for this. The last 2 bytes may be used to indicate something else.
Bytes 34 +: List of Items:
A Search Response Item for each result found.
Last 16 bytes: Footer:
The clientID128 of the host which found the results. This value is stored in the gnutella.ini and is a GUID created with CoCreateGUID() the first time gnutella is started.
Bytes 0 - 3: File Index:
Each file indexed on the server has an integer value associated with it. When gnutella scans the hard drive on the server a sequential number is given to each file as it is found. This is the file index.
Bytes 4 - 7: File Size:
The size of the file (in bytes).
Bytes 8 +: File Name:
The name of the file found. No path information is sent, just the file's name. The filename field is double-NULL terminated.
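Putting the search response header, the items, and the footer together, a parser could look roughly like this. It is a sketch, not a reference implementation: it assumes little-endian unsigned integers, a full 4 byte speed field, and ASCII file names, and it works on the body only, so offset 0 here is byte 23 of the whole message. `unpack_search_response` is a made-up name.

```python
import socket
import struct

RESPONSE_HEADER = struct.Struct("<BH4sI")  # num recs, port, IP, speed
ITEM_PREFIX = struct.Struct("<II")         # file index, file size
CLIENT_ID_SIZE = 16                        # footer: ClientID128


def unpack_search_response(body):
    """Split a 0x81 body into host info, result items, and the client ID."""
    count, port, raw_ip, speed = RESPONSE_HEADER.unpack_from(body)
    footer_start = len(body) - CLIENT_ID_SIZE
    offset = RESPONSE_HEADER.size
    items = []
    for _ in range(count):
        file_index, file_size = ITEM_PREFIX.unpack_from(body, offset)
        offset += ITEM_PREFIX.size
        # The file name runs up to the double NULL that terminates it.
        name_end = body.index(b"\x00\x00", offset, footer_start)
        items.append((file_index, file_size, body[offset:name_end].decode("ascii")))
        offset = name_end + 2
    return {
        "port": port,
        "ip": socket.inet_ntoa(raw_ip),
        "speed": speed,
        "items": items,
        "client_id": body[footer_start:],
    }
```

`bytes.index` raises ValueError if a terminator is missing, which is a reasonable signal to drop a malformed packet.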
Bytes 23 - 38: ClientID128:
The ClientID128 GUID of the server the client wishes the push from.
Bytes 39 - 42: File Index:
Index of file requested. See query_response_rec for more info.
Bytes 43 - 46: Requester IP:
IP Address of the host requesting the push. Network byte order.
Bytes 47 - 48: Requester Port:
Port number of the host requesting the push.
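The push request body (bytes 23 - 48) is 26 bytes in a fixed layout, so it also fits a single struct format. As before, this is a sketch: the little-endian unsigned file index and port are assumptions, the IP is packed in network byte order as the table says, and the function names are invented.

```python
import socket
import struct

# ClientID128 (16), file index (4), requester IP (4), requester port (2)
PUSH_BODY = struct.Struct("<16sI4sH")


def pack_push_body(client_id, file_index, requester_ip, requester_port):
    """Build the 26-byte body of a 0x40 push request."""
    return PUSH_BODY.pack(
        client_id, file_index, socket.inet_aton(requester_ip), requester_port
    )


def unpack_push_body(body):
    """Return (client_id, file_index, requester_ip, requester_port)."""
    client_id, file_index, raw_ip, port = PUSH_BODY.unpack(body)
    return client_id, file_index, socket.inet_ntoa(raw_ip), port
```

The client ID and file index both come out of the search response that listed the file.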
An issue everyone wants to ask me about nowadays is routing. "Do I forward every packet I see to every connected host?" Holy Jesus no! That would swamp the network with duplicate packets (which it already is). Here's the secret. For simplicity's sake, TTL is not discussed in this section.
(Forgive the non-straight lines, but the internet's like that)
Imagine yourself as node 1 in the above diagram. You have direct gnutellanet (physical socket) connections to nodes 2, 3, 4, and 5. You have reachable hosts at nodes 6 thru 13.
Here's the basic mechanics, described in the example above:
"How many computers the packet can go through before it will stop being passed around like a whore." - Nouser (#gnutella on efnet)
Anyone who knows anything about TCP/IP will tell you that TTL stands for Time To Live. Basically, when a packet (or message in our case) is sent out, it is stamped with a TTL, and each host that receives the packet decrements the TTL. If the TTL is zero, the packet is dropped, otherwise it is routed to the next host in the route. Gnutella TTLs work similarly. When a NEW message is sent from your host, the TTL is set to whatever you have set in your Config | TTL | My TTL setting. When the packet is received by the next host in line the TTL is decremented. Then that TTL is checked against that host's Config | TTL | Max TTL setting. The lower of the two numbers is placed in the outgoing TTL field. If the outgoing TTL is zero, the packet is dropped. [Capn's Note: I'm not positive about this next part.] Then the Hops field of the message is incremented and checked. If this number is greater than the Max TTL setting, the packet is dropped. [End Capn's Note.] This method means that even if you set your TTL to 255 (maximum value), odds are the TTL will be set to the default (5) by the next host in your chain.
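The TTL rules above boil down to a few lines. This sketch follows the description literally, including the hops check that the Capn's Note flags as uncertain, so treat that part with the same caution. The function name and the `max_ttl` parameter (standing in for the Config | TTL | Max TTL setting) are made up for the example.

```python
def next_hop_fields(ttl, hops, max_ttl):
    """Return the (ttl, hops) to forward with, or None if the packet is dropped."""
    # Decrement, then clamp to this host's Max TTL setting.
    outgoing_ttl = min(ttl - 1, max_ttl)
    if outgoing_ttl <= 0:
        return None
    # Uncertain step per the Capn's Note: bump hops and drop if past Max TTL.
    outgoing_hops = hops + 1
    if outgoing_hops > max_ttl:
        return None
    return outgoing_ttl, outgoing_hops


# A packet sent with TTL 255 leaves a host whose Max TTL is 5 with TTL 5.
print(next_hop_fields(255, 0, 5))
```

Returning None for "drop" keeps the forwarding loop simple: skip the packet whenever the result is None.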
This document was originally written by CapnBry, bmayland@leoninedev.SPAM.com, and was downloaded from http://capnbry.dyndns.org/gnutella/protocol.php.