Looking for Technical Signals


When evaluating a page for spam, you should start by looking for the following “technical signals.”

• Hidden text and hidden links.

• Keyword stuffing.

• Sneaky redirects.

• Cloaking with JavaScript redirects and 100% frames.

This section describes these technical signals and provides tips and tools on how to identify them.

Hidden Text and Hidden Links
Webmasters add hidden text and/or hidden links to lure search engines and users to their pages. Hidden text is visible to the search engine, but not to the user, who might find it distracting or annoying. Here are some things you should know about hidden text:

• It may be completely invisible to the human eye.

• It may be in the same color as the background color on the page, or in a color that is so close to the background color that it is almost invisible and will not be noticed.

• It may be formatted in a very, very small font size (e.g., 1-point) so that it will not be noticed.

• It may be placed outside the normal viewing area. For example, there may be a large blank space between the normal viewing area and a “hidden” area of text all the way at the bottom of the page or far to the right.

• Sometimes there is just a line or two of hidden text, but you may even see a whole page of it.

• Most hidden text is there to trick the search engine, but occasionally you will find hidden text that is not spam.

For example, if the webmaster merely hides the date of an update, it is not spam.
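
The hiding methods listed above can also be spotted in a page's inline styles. The following is a minimal sketch, not a tool from these guidelines: the function names, the numeric cut-offs, and the choice of CSS properties are all assumptions made for illustration. It only catches an exact color match, so text that is merely close to the background color still needs a visual check.

```python
import re

# Assumed cut-offs for illustration only; the guidelines give no numeric thresholds.
TINY_FONT_MAX = 2.0      # font sizes at or below this (px or pt) count as tiny
OFFSCREEN_MIN_PX = 1000  # negative offsets at least this large count as off-screen


def parse_style(style):
    """Turn an inline style string into a {property: value} dict."""
    props = {}
    for part in style.split(";"):
        if ":" in part:
            name, value = part.split(":", 1)
            props[name.strip().lower()] = value.strip().lower()
    return props


def hidden_text_signals(style, page_background="#ffffff"):
    """Return the reasons an inline style might hide text from the user."""
    props = parse_style(style)
    reasons = []
    if props.get("display") == "none" or props.get("visibility") == "hidden":
        reasons.append("not rendered")
    if props.get("color") == page_background.lower():
        reasons.append("same color as background")
    size = re.fullmatch(r"(\d+(?:\.\d+)?)(?:px|pt)", props.get("font-size", ""))
    if size and float(size.group(1)) <= TINY_FONT_MAX:
        reasons.append("tiny font")
    for name in ("left", "top", "text-indent", "margin-left"):
        offset = re.fullmatch(r"-(\d+)px", props.get(name, ""))
        if offset and int(offset.group(1)) >= OFFSCREEN_MIN_PX:
            reasons.append("positioned off-screen")
            break
    return reasons
```

For example, hidden_text_signals("color: #FFFFFF; font-size: 1pt") reports both a background-colored font and a tiny font. An empty result does not prove the page is clean; it only means none of these particular patterns appeared.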

Hidden text may be revealed by:

• Applying Ctrl-A (or "Command" and "A" for Apple computer users).

• Looking outside the normal viewing area.

• Using advanced techniques: disabling CSS, disabling JavaScript, viewing the source code.

Apply Ctrl-A to the Landing Page

After you have clicked on the URL, simultaneously press the “Ctrl” and “A” keys (the keyboard shortcut for “Select All” for PC users), or the "Command" and "A" keys (the keyboard shortcut for Apple computer users), and then scroll down the whole page. This technique sometimes reveals text that has been hidden.


Example of hidden text revealed by applying Ctrl-A:


LP before applying Ctrl-A

LP after applying Ctrl-A


Tiny text is not always exposed using Ctrl-A. You should be suspicious of horizontal lines or bars on the LP because sometimes they contain hidden text. A simple technique for revealing this type of hidden text is to select and copy the suspicious line or bar, paste it in your word processor, and increase the font size. You may also try using the techniques described below.

Look Outside the Normal Viewing Area
Be suspicious of large blank areas on the bottom and far right portions of the page. Use the vertical and horizontal scroll bars to see whether there is text on the portion(s) of the page outside the main viewing area.

Advanced Techniques

Use these techniques when the page is suspect and you want to dig deeper.

Disabling CSS: Disabling CSS sometimes reveals hidden text. Here are instructions for disabling CSS using the Web Developer toolbar (a Firefox add-on that can be helpful for detecting spam).

• Click on “CSS.”

• On the dropdown menu, click on “Disable Styles.”

• Click on “All Styles.”

You do not need to check every page for hidden text in CSS, but please do check if the page is suspect.


Example of hidden text revealed by disabling CSS:

LP before disabling CSS

Disabling CSS

LP after disabling CSS


Disabling JavaScript: Spammers sometimes use JavaScript to hide text. Here are instructions for disabling JavaScript using the Web Developer toolbar:

• Click on “Disable.”

• On the dropdown menu, click on “Disable JavaScript.”

• Click on “All JavaScript.”

• Refresh the page.

You can also disable JavaScript using your browser menu. For example, this is how you would do it in Firefox.

Disabling JavaScript using your browser window in Firefox:

• Go to “Tools.”

• Click on “Options.”

• Click on “Content” or “Web Features.”

• To disable JavaScript, uncheck the “Enable JavaScript” box.

• Click “OK.”

Example of hidden text revealed by disabling JavaScript:

LP before disabling JavaScript

Disabling JavaScript

LP after disabling JavaScript


Important: When you are done looking for spam on a particular page, please remember to go back and enable JavaScript. If you do not do this, certain features on pages you open will not work.

Viewing Source Code: Viewing the source code sometimes reveals hidden text. This is how you would do it in Firefox.


Viewing Source Code in Firefox:

• Go to “View.”

• Click on “Page Source.”

Or:

• Right-click on the page.

• Click on “View Page Source.”

Look for large areas of keyword stuffing in the source code. Keyword stuffing is discussed in Section 12.3.5.


Example of hidden text revealed by viewing the source code:

Landing page

Viewing source code

Source code of the LP


Please note that a page should not be considered spam when the keyword stuffing appears in the meta tags only. Meta tags are easy to identify because they start with the words “meta name.”
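
To check whether repeated keywords sit only in the meta tags or also in the body of the page, you could count them separately. This is a rough sketch using Python's standard html.parser module; the class name MetaVersusBody and the function keyword_counts are hypothetical names chosen for this example. Text inside script blocks in the body is counted as body text here.

```python
from html.parser import HTMLParser


class MetaVersusBody(HTMLParser):
    """Collect meta tag content and body text into separate lists."""

    def __init__(self):
        super().__init__()
        self.meta_text = []
        self.body_text = []
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and "name" in attributes:
            # Tags that start with "meta name", as described above.
            self.meta_text.append(attributes.get("content") or "")
        elif tag == "body":
            self._in_body = True

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False

    def handle_data(self, data):
        if self._in_body:
            self.body_text.append(data)


def keyword_counts(source, keyword):
    """Count occurrences of a keyword in meta tags and in the body."""
    parser = MetaVersusBody()
    parser.feed(source)
    parser.close()
    keyword = keyword.lower()
    return {
        "meta": " ".join(parser.meta_text).lower().count(keyword),
        "body": " ".join(parser.body_text).lower().count(keyword),
    }
```

A result with a high "meta" count and a "body" count of zero matches the case described above, where the stuffing is in the meta tags only.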


Example of keyword stuffing that appears in the meta tags only and is not considered to be spam:


Landing page
Source code with keyword stuffing in the meta tags only


Keyword Stuffing

Keyword Stuffing: Webmasters sometimes load pages with an excessive number of keywords. Here are descriptions of what you might see:

• Keywords repeated many times on the page.

• Words that are related to keywords repeated many times on the page.

• Multiple misspellings of keywords on the page.

• Pages with a large amount of what looks like gibberish or random keywords.

• Pages that appear to consist of programmatically or automatically generated text that doesn’t really make sense.

Webmasters also sometimes load pages with irrelevant keywords on topics that are unrelated to the query, such as mortgages, cell phones, ringtones, gambling, weather, etc. Whether the keywords are related or unrelated to the query, the intent is to draw search engines and users to the page.

It is sometimes difficult to decide when the keywords on a page should be considered keyword stuffing. We ask you to consider a page to be spam if you think the number of keywords on the page is excessive and would be annoying and distracting to the real user.
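
If you want a rough number to support your judgment, you can measure what share of the words on a page is a single keyword. The sketch below is only an illustration: the function name keyword_share is hypothetical, it handles single-word keywords only, and the guidelines do not define any cut-off value, so the result is an aid and not a verdict.

```python
import re
from collections import Counter


def keyword_share(text, keyword):
    """Return the fraction of words in the text that equal the keyword."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return Counter(words)[keyword.lower()] / len(words)
```

For the sentence "Imodium works. Buy Imodium now, Imodium cheap", the keyword "Imodium" makes up 3 of the 7 words. Whether a given share is excessive still depends on whether it would annoy or distract a real user.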

Please note: Hidden text and keyword stuffing often go together. Hidden text frequently contains keyword stuffing.

Recognizing keyword stuffing


Some keyword stuffing is visible to the human eye and you will not have to use any special techniques to see it. In other cases, it is hidden. You will discover hidden keyword stuffing by using the techniques already described. Important: Hidden keyword stuffing will always be considered spam (unless it is only in the source code meta tags).

Here are some examples that most users would consider excessive and annoying, even though in some cases the keywords are in the portion of the page “below the fold,” which users would have to scroll down to see:


Fake Feed with Keyword Stuffing Example: The keyword stuffing on this page is just a collection of links with nonsensical link names. Notice how many times the words “las vegas” and “casino” appear on the page.


Fake Blog with Keyword Stuffing Example: Notice how many times the word “Imodium” appears in this fake blog entry.


Computer-Generated Page with Keyword Stuffing Example: Notice all the spelling variations of “Nissan” and the computer-generated gibberish text.

Keyword Stuffing in the URL

URLs may also contain keyword stuffing. These URLs are computer-generated based on the words in the query and are often formatted with many hyphens (dashes) in them. They are a strong spam signal.
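
Counting the hyphens in a URL is one quick way to notice this pattern. The following sketch uses Python's standard urllib.parse module; the function name url_hyphen_count is hypothetical, and no particular count is defined by the guidelines as the point where a URL becomes spam.

```python
from urllib.parse import urlsplit


def url_hyphen_count(url):
    """Count hyphens in the host name and path of a URL."""
    parts = urlsplit(url)
    return (parts.hostname or "").count("-") + parts.path.count("-")
```

A made-up URL such as http://cheap-las-vegas-casino.example.com/las-vegas-casino-hotel-deals.html contains 7 hyphens, while http://www.example.com/about.html contains none.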


Keyword Stuffing in the URL Example


Here are some additional examples of URLs with keyword stuffing. We have removed the hyperlinks from these examples because they have stopped working and/or become malicious. You do not need to click through to the landing page in order to see that there is keyword stuffing in the URL and that they are spam.

• http://frat-boy-blog-gay.grandbrooklynlodge.cn/boy-brief-frat-in-their-wet.html

• http://brazilian-model-alexandra.wantloweryour.cn/brazilian-model-adriana-lima.html u=jkohjil

Sneaky Redirects
Sneaky Redirects: We call it a sneaky redirect when a page redirects the user from a URL on one domain to a different URL on a different domain, with spam intent. Search engines “see” the first page, while the user is sent to a different page and sees different content. Here are some other things you should know about sneaky redirects:

• While being redirected, you may notice that the page redirects through several URLs before ending up on the landing page.

• Sneaky redirects may take the user to one of several rotating domains; so clicking on the same URL several times may send you to different landing pages each time.

• Some sneaky redirects take users to well-known merchant websites, such as Amazon, eBay, Zappos, etc.

Recognizing sneaky redirects

Compare the two URLs: Compare the URL in the rating task to the URL of the landing page to see if it makes sense that one would redirect to the other. A redirect from a company’s old homepage to its new homepage on a different domain is not sneaky. Redirects from one page on a domain to another page on the same domain are also not sneaky.

Look at the domain registrants: If you suspect that a sneaky redirect has taken place, you should check to see “whois” the registrant (or owner) of the two domains. If the registrant is the same, the redirect is less likely to be sneaky.


Using “Who Is”
Here are instructions for checking “whois” the domain registrant:

• Go to the site of a “whois” provider. Here are two you can use: http://www.domaintools.com and http://whois.mtgsy.net/default.php. Some computers also allow you to run a command like whois example.com from a terminal window.

• Enter the URL of one domain in the search box on the “whois” page. Sometimes, you will need to delete some leading or following characters. For example, if the URL is http://supportapj.dell.com/support, you will enter just “dell.com” in the search box of the whois provider.

• Open another “whois” page.

• Enter the URL of the other domain in the search box on the second “whois” page.

• Compare the domain registrants for the two URLs. If you find that they have the same domain registrant, you will *typically* conclude that the page is not spam. If they are different and do not seem related, it is probably spam.
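
The trimming step in these instructions (turning http://supportapj.dell.com/support into “dell.com”) can be sketched in code. This is a simplified illustration with hypothetical function names. It assumes the domain is always the last two labels of the host name, which is wrong for suffixes such as .co.uk, so check the result by eye.

```python
from urllib.parse import urlsplit


def whois_query_domain(url):
    """Naively reduce a URL to the domain you would type into a whois search box."""
    host = urlsplit(url).hostname or ""
    # Assumption: the registered domain is the last two dot-separated labels.
    return ".".join(host.split(".")[-2:])


def same_domain(url_a, url_b):
    """Return True when both URLs reduce to the same domain."""
    return whois_query_domain(url_a) == whois_query_domain(url_b)
```

When same_domain returns True, the redirect stays on one domain and no whois lookup is needed. When it returns False, you still need to compare the registrants, because two different-looking domains can belong to the same owner.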

Sneaky Redirect Example

URL before clicking (task URL)

URL after clicking (URL of the landing page)


Using a “whois” provider to learn about the domain registrants

Entering the domain of the task URL

Domain registrant information of the task URL

Entering the domain of the LP URL

Domain registrant of the LP URL


The domain registrants in this example are different. Since these two domain registrants have no relationship, this is a sneaky redirect.

Non-Sneaky Redirect Example


Trans World Airlines (TWA) was acquired by American Airlines in 2001. The URL of the TWA homepage was twa.com. The URL of the American Airlines homepage is aa.com.


The domain registrant of twa.com and the domain registrant of aa.com are the same, so this is not a sneaky redirect.

Please be aware that domain names with the same domain registrant can look very different. For example, Barnes and Noble, the bookseller, owns the following domains: www.barnesandnoble.com, www.bn.com, and www.books.com.

Cloaking
It is called “cloaking” when the webmaster shows different pages to the search engine and the user. True cloaking is somewhat rare, but spammers do use other methods to show different pages to search engines than to users. Two such techniques used by spammers are:

• JavaScript redirects

• 100% frames

Spammers use JavaScript redirects to show one page to search engines while sending users to a different page. Looking at the page first with JavaScript enabled and then with JavaScript disabled reveals the differences.

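
When you view the source code of a suspect page, a JavaScript redirect often shows up as an assignment to location or a call to location.replace. The sketch below searches for those two forms only. The function name and the patterns are assumptions made for this example, and a redirect that builds its target from pieces or hides it in obfuscated code will not be found this way.

```python
import re

# Matches forms like: window.location = "...", location.href = '...'
ASSIGNMENT = re.compile(r"""\blocation(?:\.href)?\s*=\s*["']([^"']+)["']""")
# Matches forms like: location.replace("...") or location.assign('...')
CALL = re.compile(r"""\blocation\.(?:replace|assign)\(\s*["']([^"']+)["']\s*\)""")


def find_js_redirect_targets(source):
    """Return the literal redirect targets found in the page source."""
    return ASSIGNMENT.findall(source) + CALL.findall(source)
```

Any target this returns is only a lead. Comparing the page with JavaScript enabled and disabled, as described above, remains the way to confirm what the user and the search engine each see.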
Webmasters sometimes cloak what users see by using frames. Two frames (pages) exist, but one frame takes up 100% of the screen. The user sees one frame (page), but the search engine sees both frames. Here are instructions for looking at the different frames in Firefox:


Viewing Frame Information in Firefox

• Right-click on the page.

• Click “This Frame.”

• Click “View Frame Info.”

• Compare the URL of the frame with the URL of the page. If they are different, the page is probably 100% framed, and should be considered spam.
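
The comparison in the last step can also be made from the page source. The following sketch uses Python's standard html.parser and urllib.parse modules to list frames whose host name differs from the host name of the page. The names FrameCollector and foreign_frames are hypothetical, and the code does not check whether a frame actually fills 100% of the screen, so treat the result as a starting point.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class FrameCollector(HTMLParser):
    """Collect the src attribute of every frame and iframe tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag in ("frame", "iframe"):
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def foreign_frames(page_url, source):
    """Return frame URLs whose host differs from the host of the page."""
    collector = FrameCollector()
    collector.feed(source)
    collector.close()
    page_host = urlsplit(page_url).hostname
    foreign = []
    for src in collector.sources:
        frame_url = urljoin(page_url, src)  # resolve relative frame URLs
        if urlsplit(frame_url).hostname != page_host:
            foreign.append(frame_url)
    return foreign
```

If the list is not empty, follow up by checking the domain registrants of the page and the frame.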

100% Frame Example

URL of the landing page

Viewing frame information

Comparing the URLs (of the landing page and the frame)

Domain registrant of the landing page (neoobe.com in Westchester, California)

Domain registrant of the frame (animaturk.com in Istanbul, Turkey)


Since the URLs and the domain registrants are different, the page is probably 100% framed.

