Understanding the Google API Response

While the Google API grants you programmatic access to the lion's share of Google's index, it doesn't provide all the functionality available through the Google.com web site's search interface.

Can Do



The Google API, in addition to simple keyword queries, supports the following special syntaxes [Section 1.5]:

site:
daterange:
intitle:
inurl: 
allintext:
allinlinks:
filetype:
info:
link:
related: 
cache: 
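As a minimal sketch of how the special syntaxes above combine with plain keywords, the query is just one string of space-separated terms. The helper name `build_query` and the example domain are hypothetical, chosen for illustration; they are not part of the Google API.

```python
def build_query(keywords, **operators):
    """Join plain keywords with special-syntax terms such as site:example.com.

    `build_query` is a hypothetical helper, not part of the Google API.
    """
    terms = list(keywords)
    for name, value in operators.items():
        # Each special syntax is written as name:value with no space after the colon.
        terms.append(f"{name}:{value}")
    return " ".join(terms)


query = build_query(["python", "tutorial"], site="example.com", intitle="guide")
print(query)  # python tutorial site:example.com intitle:guide
```

The resulting string is what you would hand to the API as the query, exactly as you would type it into the Google.com search box.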

Can't Do

The Google API does not support these special syntaxes:

phonebook:
rphonebook:
bphonebook:
stocks:

While queries of this sort provide no individual results, aggregate result data is sometimes returned and can prove rather useful. kincount.cgi [Tip #70], one of the tips in this tutorial, takes advantage of result counts returned for phonebook: queries.

The 10-Result Limit

While searches through the standard Google.com home page can be tuned [Tip #1] to return 10, 20, 30, 50, or 100 results per page, the Google Web API limits the number to 10 per query. The rest are still available to you, mind you, but getting them takes a wee bit of creative coding: looping through the results, 10 at a time [Tip #1].
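The looping described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the API itself: `search_page` stands in for whatever function actually calls the Google API, and its signature (query, starting offset, maximum results) is assumed here. `fake_search_page` is a made-up stub so the sketch runs without a network connection or license key.

```python
def fetch_results(search_page, query, wanted, page_size=10):
    """Collect up to `wanted` results by requesting pages of at most `page_size`.

    `search_page(query, start, max_results)` is an assumed callable that
    returns a list of results beginning at offset `start`.
    """
    results = []
    start = 0
    while len(results) < wanted:
        page = search_page(query, start, page_size)
        if not page:
            break  # nothing more to fetch
        results.extend(page)
        if len(page) < page_size:
            break  # a short page means we have reached the end
        start += page_size
    return results[:wanted]


def fake_search_page(query, start, max_results):
    """Hypothetical stand-in for a real API call: serves slices of 34 fake URLs."""
    corpus = [f"http://example.com/{query}/{i}" for i in range(34)]
    return corpus[start:start + max_results]


urls = fetch_results(fake_search_page, "hacks", 25)
print(len(urls))  # 25
```

Asking for 25 results costs three queries of 10 each, which is worth keeping in mind if your queries are metered.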

What's in the Results

The Google API provides both aggregate and per-result data in its result set.

Aggregate data

The aggregate data, information on the query itself and on the kinds and number of results that query turned up, consists of:

  • «documentFiltering»: A Boolean (true/false) value specifying whether or not results were filtered for very similar results or those that come from the same web host
  • «directoryCategories»: A list of directory categories, if any, associated with the query

Individual search result data

The "guts" of a search result - the URLs, page titles, and snippets - are returned in a «resultElements» list. Each result consists of the following elements:

  • «summary»: The Google Directory summary, if available
  • «URL»: The search result's URL; consistently starts with http://
  • «snippet»: A brief excerpt of the page with query terms highlighted in bold (HTML «b» «/b» tags)
  • «title»: The page title in HTML
  • «cachedSize»: The size in kilobytes (K) of the Google-cached version of the page, if available

You'll notice the conspicuous absence of PageRank [Tip #95]. Google does not make PageRank available through anything but the official Google Toolbar [Tip #24]. You can get a general idea of a page's popularity by looking over the "popularity bars" in the Google Directory.
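To make the per-result elements listed above concrete, here is one way they might be modeled on the client side. This is a sketch under assumptions: the class name `ResultElement`, its Python field names, the treatment of `cached_size` as text, and the sample values are all invented for illustration. Only the element names themselves come from the list above.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResultElement:
    """Hypothetical container mirroring the per-result elements."""
    url: str                            # «URL»
    title: str                          # «title», may contain HTML
    snippet: str                        # «snippet», query terms wrapped in <b> tags
    summary: Optional[str] = None       # «summary», only if available
    cached_size: Optional[str] = None   # «cachedSize», only if available (assumed to be text)


# Matches opening and closing bold tags, in either letter case.
_BOLD_TAG = re.compile(r"</?b>", re.IGNORECASE)


def plain_snippet(result):
    """Return the snippet with the bold highlighting tags stripped."""
    return _BOLD_TAG.sub("", result.snippet)


example = ResultElement(
    url="http://www.example.com/",
    title="Example Page",
    snippet="An <b>example</b> snippet with the query term in bold.",
)
print(plain_snippet(example))  # An example snippet with the query term in bold.
```

Marking «summary» and «cachedSize» as optional reflects the "if available" wording above; code that consumes results should be prepared for either to be missing.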