Entities and Encodings
HTTP ships billions of media objects of all kinds every day. Images, text, movies, software programs . . . you name it, HTTP ships it. HTTP also makes sure that its messages can be properly transported, identified, extracted, and processed. In particular, HTTP ensures that its cargo:
· Can be identified correctly (using the Content-Type header for the media format and the Content-Language header for the language) so browsers and other clients can process the content properly
· Can be unpacked properly (using Content-Length and Content-Encoding headers)
· Is fresh (using entity validators and cache-expiration controls)
· Meets the user's needs (based on content-negotiation Accept headers)
· Moves quickly and efficiently through the network (using range requests, delta encoding, and data compression)
· Arrives complete and untampered with (using transfer encoding headers and Content-MD5 checksums)
To make all this happen, HTTP uses well-labeled entities to carry content.
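To make the labeling concrete, here is a minimal sketch of how a sender might attach entity headers to a piece of content and how a receiver might use them to unpack it. The function names build_entity and unpack_entity are our own illustrations, not part of any HTTP library, and the sketch assumes gzip as the only content encoding and UTF-8 as the only character set.

```python
import base64
import gzip
import hashlib
from typing import Dict, Tuple


def build_entity(text: str, language: str = "en") -> Tuple[Dict[str, str], bytes]:
    """Hypothetical sender: compress the text and label the result."""
    body = gzip.compress(text.encode("utf-8"))
    # Content-MD5 is the base64-encoded MD5 digest of the entity body
    # after content encoding has been applied.
    digest = base64.b64encode(hashlib.md5(body).digest()).decode("ascii")
    headers = {
        "Content-Type": "text/plain; charset=utf-8",  # what the cargo is
        "Content-Language": language,                 # who it is written for
        "Content-Encoding": "gzip",                   # how it was packed
        "Content-Length": str(len(body)),             # size of the packed body
        "Content-MD5": digest,                        # integrity check
    }
    return headers, body


def unpack_entity(headers: Dict[str, str], body: bytes) -> str:
    """Hypothetical receiver: verify the labels, then reverse the encoding."""
    if len(body) != int(headers["Content-Length"]):
        raise ValueError("body length does not match Content-Length")
    digest = base64.b64encode(hashlib.md5(body).digest()).decode("ascii")
    if digest != headers["Content-MD5"]:
        raise ValueError("body does not match Content-MD5")
    if headers.get("Content-Encoding") == "gzip":
        body = gzip.decompress(body)
    return body.decode("utf-8")  # assumes the charset named in Content-Type


if __name__ == "__main__":
    headers, body = build_entity("Hello, web cargo")
    for name, value in headers.items():
        print(f"{name}: {value}")
    print(unpack_entity(headers, body))
```

Note that both Content-Length and Content-MD5 describe the body as it travels, that is, after compression. The receiver checks them first and only then undoes the content encoding.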
This chapter discusses entities, their associated entity headers, and how they work to transport web cargo. We'll show how HTTP provides the essentials of content size, type, and encodings. We'll also explain some of the more complicated and powerful features of HTTP entities, including range requests, delta encoding, digests, and chunked encodings.
This chapter covers:
· The format and behavior of HTTP message entities as HTTP data containers
· How HTTP describes the size of entity bodies, and what HTTP requires of senders when they declare that size
· The entity headers used to describe the format, alphabet, and language of content, so clients can process it properly
· Reversible content encodings, used by senders to transform the content data format before sending to make it take up less space or be more secure
· Transfer encoding, which modifies how HTTP ships data to enhance the communication of some kinds of content, and chunked encoding, a transfer encoding that chops data into multiple pieces to deliver content of unknown length safely
· The assortment of tags, labels, times, and checksums that help clients get the latest version of requested content
· The validators that act like version numbers on content, so web applications can ensure they have fresh content, and the HTTP header fields designed to control object freshness
· Ranges, which are useful for continuing aborted downloads where they left off
· HTTP delta encoding extensions, which allow clients to request just those parts of a web page that actually have changed since a previously viewed revision
· Checksums of entity bodies, which are used to detect changes in entity content as it passes through proxies
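As a preview of one of these topics, chunked encoding, the sketch below shows roughly how a body of unknown total length can be sent as a series of pieces. Each piece is prefixed with its size in hexadecimal, and a zero-size chunk marks the end. The helper names encode_chunked and decode_chunked are hypothetical, and the sketch ignores trailers, which the full encoding also allows.

```python
from typing import Iterable


def encode_chunked(pieces: Iterable[bytes]) -> bytes:
    """Frame each piece as: hex size, CRLF, data, CRLF."""
    out = bytearray()
    for piece in pieces:
        if not piece:
            continue  # an empty piece would look like the end-of-body marker
        out += f"{len(piece):X}\r\n".encode("ascii") + piece + b"\r\n"
    out += b"0\r\n\r\n"  # zero-size chunk: no more data follows
    return bytes(out)


def decode_chunked(data: bytes) -> bytes:
    """Reassemble the original body from chunked framing (no trailers)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Drop any chunk extension that follows a semicolon.
        size_field = data[pos:line_end].split(b";", 1)[0]
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break
        body += data[pos:pos + size]
        pos += size + 2  # skip the data and its trailing CRLF
    return bytes(body)


if __name__ == "__main__":
    wire = encode_chunked([b"Hello, ", b"world"])
    print(wire)
    print(decode_chunked(wire))
```

Because each chunk announces its own size, the sender never needs to know the total length up front, and the receiver can still tell exactly where the body ends.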