Google Directory

Google has a searchable subject index in addition to its 2-billion-page web search.



Google's web search indexes over 2 billion pages, which means that it isn't suitable for every search. When you can't narrow a search down, for example when you're looking for information on a person about whom you know nothing, 2 billion pages will get very frustrating very quickly.

But you don't have to limit your searches to the web search. Google also has a searchable subject index, the Google Directory, at http://directory.google.com. Rather than indexing the full text of billions of pages, the directory describes sites, indexing about 1.5 million URLs. This makes it a much better choice for searching general topics.

Does Google spend time building a searchable subject index in addition to a full-text index? No. Google bases its directory on the Open Directory Project data at http://dmoz.org/. The collection of URLs at the Open Directory Project is gathered and maintained by a group of volunteers, but Google does add some of its own Googlish magic to it.

Figure 2-1. The Google Directory

As you can see, the front of the site is organized into several topics. To find what you're looking for, you can either do a keyword search, or "drill down" through the hierarchies of subjects.

Beside most of the listings, you'll see a green bar. The green bar is an approximate indicator of the site's PageRank in the Google search engine. (Not every listing in the Google Directory has a corresponding PageRank in the Google web index.) By default, web sites are listed in order of Google PageRank, but you also have the option to list them in alphabetical order.

One thing you'll notice about the Google Directory is how the annotations and other information vary between categories. That's because the information in the directory is maintained by a small army of volunteers (about 20,000) who are each responsible for one or more categories. For the most part, annotation is pretty good. Figure 2-1 shows the Google Directory.

Searching the Google Directory

The Google Directory does not have the various complicated special syntaxes for searching that the web search does. That's because this is a far smaller collection of URLs, ideal for more general searching. However, there are a couple of special syntaxes you should know about.


intitle:
    Just like the Google web special syntax, intitle: restricts the query word search to the title of a page.

inurl:
    Restricts the query word search to the URL of a page.
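As a minimal sketch of how these two syntaxes combine with ordinary keywords, the following Python helper assembles a query string. The function name `directory_query` and its parameters are hypothetical, introduced only for illustration; the result is simply the text you would type into the Google Directory search box.

```python
def directory_query(keywords, intitle=None, inurl=None):
    """Assemble a query string using the intitle: and inurl: syntaxes.

    Hypothetical helper for illustration; it only builds text and does
    not contact Google.
    """
    parts = []
    if keywords and keywords.strip():
        parts.append(keywords.strip())
    if intitle:
        # No space between the syntax and its query word.
        parts.append(f"intitle:{intitle}")
    if inurl:
        parts.append(f"inurl:{inurl}")
    return " ".join(parts)


print(directory_query("wodehouse", intitle="jeeves"))
```

Keeping the keywords general and adding at most one special syntax fits the smaller pool of sites in the directory.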

When you're searching on Google's web index, your overwhelming concern is probably how to get your list of search results to something manageable. With that in mind, you might start by coming up with the narrowest search possible.

That's a reasonable strategy for the web index, but because you have a narrower pool of sites in the Google Directory, you want to start more general with your Google Directory search.

For example, say you were looking for information on author P. G. Wodehouse. A simple search on P. G. Wodehouse in Google's web index will get you over 25,000 results, possibly compelling you to immediately narrow down your search. But doing the same search in the Google Directory returns only 96 results. You might consider that a manageable number of results, or you might want to carefully start narrowing down your results further.

The Directory is also good for searching for events. A Google web search for "Korean War" will find you literally hundreds of thousands of results, while searching the Google Directory will find you just over 1,200. This is a case where you will probably need to narrow down your search. Use general words indicating what kind of information you want: timeline, for example, or archives, or lesson plans. Don't narrow down your search with names or locations; that's not the best way to use the Google Directory.
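The narrowing approach described above can be sketched in a few lines of Python. This is only an illustration: the function name `general_variants` is hypothetical, and the default list of words is taken from the examples in the text. It produces query strings to try by hand and does not run any search.

```python
def general_variants(topic, kinds=("timeline", "archives", "lesson plans")):
    """Pair a quoted topic with general words describing the kind of
    information wanted (hypothetical helper for illustration)."""
    quoted = f'"{topic}"'
    return [f"{quoted} {kind}" for kind in kinds]


for query in general_variants("Korean War"):
    print(query)
```

Each variant keeps the topic fixed and changes only the general descriptive word, rather than adding names or locations.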

The Google Directory and the Google API

Unfortunately, the Google Directory is not covered by the Google API.