External Unparsed Entities and Notations

Not all data is XML. There are a lot of ASCII text files in the world that don't give two cents about escaping. The mechanism that XML suggests for embedding such non-XML data in your documents is the external unparsed entity. The DTD specifies a name and a URI for the entity containing the non-XML data. For example, this ENTITY declaration associates the name turing_getting_off_bus with the JPEG image at http://www.turing.org.uk/turing/pi1/bus.jpg:

<!ENTITY turing_getting_off_bus SYSTEM "http://www.turing.org.uk/turing/pi1/bus.jpg" NDATA jpeg>

Notations

Since the data is not in XML format, the NDATA declaration names the type of the data, here jpeg. That name refers to a notation declared elsewhere in the DTD, using a NOTATION declaration like this one:

<!NOTATION jpeg SYSTEM "image/jpeg">

Here we've used the MIME media type image/jpeg as the external identifier for the notation. However, there is absolutely no standard or even a suggestion for exactly what this identifier should be. Individual applications must define their own requirements for the contents and meaning of notations.

Embedding Unparsed Entities in Documents

The DTD only declares the existence, location, and type of the unparsed entity. To actually include the entity in the document at one or more locations, you insert an element with an ENTITY-type attribute whose value is the name of the unparsed entity. Suppose the image element and its source attribute are declared like this:

<!ELEMENT image EMPTY>
<!ATTLIST image source ENTITY #REQUIRED>

Then, this image element refers to the photograph at http://www.turing.org.uk/turing/pi1/bus.jpg:

<image source="turing_getting_off_bus"/>

We should warn you that XML doesn't guarantee any particular behavior from an application that encounters this type of unparsed entity. It very well may not display the image to the user. Indeed, the parser may be running in an environment where there's no user to display the image to. It may not even understand that this is an image. The parser may not load the image or make any sort of connection with the server where the actual image resides. At most, it will tell the application on whose behalf it's parsing that there is an unparsed entity at a particular URI with a particular notation and let the application decide what, if anything, it wants to do with that information.

TIP: Unparsed general entities are not the only plausible way to embed non-XML content in XML documents.
In particular, a simple URL, possibly associated with an XLink, does a fine job for many purposes, just as it does in HTML (which gets along just fine without any unparsed entities). Including all the necessary information in a single empty element like

Notations for Processing Instruction Targets

Notations can also be used to identify the exact target of a processing instruction. A processing instruction target must be an XML name, which means it can't be a full path like /usr/local/bin/tex. A notation can associate a short XML name like tex with such a path:

<!NOTATION tex SYSTEM "/usr/local/bin/tex">

In practice, this technique isn't much used or needed. Most applications that read XML files and pay attention to particular processing instructions simply recognize a particular target string.
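
To make concrete what a parser actually hands to the application, here is a minimal sketch using Python's standard xml.parsers.expat module. It is one possible illustration, not the only way to process these declarations. The function name describe_unparsed_content, the sample document, and the data inside the tex processing instruction are assumptions made for this example; the declarations themselves are the ones shown above. The parser only reports names and identifiers. It never fetches the image, and resolving a processing instruction target through a notation is something the application does itself.

```python
import xml.parsers.expat

# Hypothetical sample document. The declarations match the ones in the text;
# the <?tex ...?> instruction and its data are invented for illustration.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE image [
  <!NOTATION jpeg SYSTEM "image/jpeg">
  <!NOTATION tex SYSTEM "/usr/local/bin/tex">
  <!ENTITY turing_getting_off_bus
    SYSTEM "http://www.turing.org.uk/turing/pi1/bus.jpg" NDATA jpeg>
  <!ELEMENT image EMPTY>
  <!ATTLIST image source ENTITY #REQUIRED>
]>
<?tex x^2 + y^2?>
<image source="turing_getting_off_bus"/>
"""


def describe_unparsed_content(document):
    """Collect what the parser reports about notations and unparsed entities."""
    notations = {}             # notation name -> system identifier
    entities = {}              # entity name -> (system identifier, notation name)
    entity_attributes = set()  # (element, attribute) pairs declared as ENTITY
    references = []            # (element, attribute, entity URI, notation identifier)
    instructions = []          # (target, notation identifier or None, data)

    def on_notation(name, base, system_id, public_id):
        notations[name] = system_id

    def on_unparsed_entity(name, base, system_id, public_id, notation_name):
        entities[name] = (system_id, notation_name)

    def on_attlist(element, attribute, attr_type, default, required):
        if attr_type == "ENTITY":
            entity_attributes.add((element, attribute))

    def on_start_element(name, attributes):
        for attribute, value in attributes.items():
            if (name, attribute) in entity_attributes and value in entities:
                system_id, notation_name = entities[value]
                references.append(
                    (name, attribute, system_id, notations.get(notation_name))
                )

    def on_processing_instruction(target, data):
        # Application-level convention: look the target up among the notations.
        instructions.append((target, notations.get(target), data))

    parser = xml.parsers.expat.ParserCreate()
    parser.NotationDeclHandler = on_notation
    parser.UnparsedEntityDeclHandler = on_unparsed_entity
    parser.AttlistDeclHandler = on_attlist
    parser.StartElementHandler = on_start_element
    parser.ProcessingInstructionHandler = on_processing_instruction
    parser.Parse(document, True)

    return {
        "notations": notations,
        "entities": entities,
        "references": references,
        "instructions": instructions,
    }


if __name__ == "__main__":
    for key, value in describe_unparsed_content(SAMPLE).items():
        print(key, value)
```

What to do with the reported URI and notation identifier (display the image, hand the data to another program, or ignore it entirely) remains the application's decision, as the text describes.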