inurl: Versus site:

Use inurl: syntax to search site subdirectories.

The site: special syntax is perfect for those situations in which you want to restrict your search to a certain domain or domain suffix like "example.com," "www.example.org," or "edu": site:edu. But it breaks down when you're trying to search for a site that exists beneath the main or default site (i.e., in a subdirectory like /~sam/album/).

For example, if you're looking for something below the main GeoCities site, you can't use site: to find all the pages in http://www.geocities.com/Heartland/Meadows/6485/; Google will return no results. Enter inurl:, a Google special syntax [Section 1.5] for specifying a string that must appear in the URL of each result. Rewritten with inurl:, the query works as expected:

inurl:www.geocities.com/Heartland/Meadows/6485/

While Google simply ignores the http:// prefix in a URL used with site:, search results come up short when you include it in an inurl: query. Be sure to remove such prefixes from any inurl: query for the best (read: any) results.
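If you build these queries in a script, stripping the prefix is easy to automate. The following is only a minimal sketch using Python's standard urllib.parse module; the helper name inurl_query is made up for illustration and is not part of any Google tool.

```python
from urllib.parse import urlsplit


def inurl_query(url: str) -> str:
    """Build an inurl: query from a URL, dropping any scheme prefix.

    inurl_query is a hypothetical helper name used only for this sketch.
    """
    # urlsplit only recognizes the host when the string has a scheme or
    # starts with "//", so add "//" to bare strings such as "www.example.com/x".
    parts = urlsplit(url if "://" in url else "//" + url)
    # Host plus path keeps the subdirectory but leaves out "http://".
    return "inurl:" + parts.netloc + parts.path


print(inurl_query("http://www.geocities.com/Heartland/Meadows/6485/"))
```

The same call works whether or not the input already lacks the prefix, so it is safe to run on every URL before handing the query to Google.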

You'll see that using the inurl: query instead of the site: query has two immediate advantages:

You can use inurl: by itself without using any other query words (which you can't do with site:).
You can use it to search subdirectories.
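To see why the two syntaxes behave differently, it can help to model them locally. The sketch below is a toy model only, not a description of how Google actually matches pages: it assumes site: compares against the host name alone, while inurl: looks for a string anywhere in the URL after the prefix. The function names and the second and third sample URLs are made up for illustration.

```python
from urllib.parse import urlsplit


def matches_site(url: str, domain: str) -> bool:
    """Toy model of site: (an assumption): compare the host name only."""
    host = urlsplit(url).netloc.lower()
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)


def matches_inurl(url: str, text: str) -> bool:
    """Toy model of inurl: (an assumption): substring match on the URL."""
    bare = url.split("://", 1)[-1]  # drop the http:// prefix, if any
    return text.lower() in bare.lower()


# Sample URLs; the last two are hypothetical.
urls = [
    "http://www.geocities.com/Heartland/Meadows/6485/index.html",
    "http://www.geocities.com/Other/Page/",
    "http://www.example.org/about.html",
]

# The host-only model cannot tell the two GeoCities pages apart...
on_site = [u for u in urls if matches_site(u, "geocities.com")]
# ...while the substring model narrows the results to one subdirectory.
in_dir = [u for u in urls
          if matches_inurl(u, "www.geocities.com/Heartland/Meadows/6485/")]

print(len(on_site), len(in_dir))
```

In this model the path never takes part in a site: match, which is why a subdirectory has to be expressed with inurl: instead.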

How Many Subdomains?

You can also use inurl: in combination with the site: syntax to get information about subdomains. For example, how many subdomains does oracle.com really have? You can't get that information via the query site:oracle.com, but neither can you get it just from the query inurl:"*.oracle.com" (because that query will pick up mirrors and other pages containing the string oracle.com that aren't at the oracle.com site).

However, this query will work just fine:

site:oracle.com inurl:"*.oracle" -inurl:"www.oracle"

This query says to Google, "Look on the site oracle.com for pages whose URLs contain the string '*.oracle' (remember the full-word wildcard? [Tip #13]) but ignore URLs containing the string 'www.oracle'" (because www is a subdomain you're already very familiar with).
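The same three-part pattern can be generated for other domains. Here is a minimal sketch that only assembles the query string; the function name subdomain_query and its known parameter are invented for this example, and it assumes a simple two-part domain such as oracle.com.

```python
def subdomain_query(domain: str, known: str = "www") -> str:
    """Assemble a site:/inurl: query that hunts for subdomains.

    subdomain_query and its parameters are hypothetical names; the
    function assumes a two-part domain like "oracle.com".
    """
    name = domain.split(".")[0]  # "oracle.com" -> "oracle"
    return f'site:{domain} inurl:"*.{name}" -inurl:"{known}.{name}"'


print(subdomain_query("oracle.com"))
```

Once you spot another subdomain in the results, you can exclude it as well by adding a further -inurl: term to the query by hand.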