Rater-Identified Duplicates

We would like your help identifying duplicate results that have not been automatically detected. Please mark two results as dupes if they have essentially the same content on the main landing page, AND you would not want a search engine to return both results for the query.

Please note that in this project dupe identification is query-dependent.

Specific queries: When the user is looking for a specific piece of content (for example, song lyrics or a particular article), seeing that content on several sites can help the user verify the information. Results like these should not be marked as dupes.

Broad queries: If the query is broad, the user is not looking for the same piece of content repeated across results, so those results should be flagged as dupes. Results may be considered dupes even if minor content on the page differs (such as ads, images, or related links).

Please identify dupes both within the same side and across sides. Even for cross-side results, you should still ask yourself the question "Would you want to see both results if they were returned by the same search engine?"
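
As a rough illustration only, the query-dependent rule above can be written as a small decision function. This is a sketch of the reasoning, not a tool used in this project: the class and field names are hypothetical, and both inputs remain your own judgment calls as a rater.

```python
from dataclasses import dataclass


@dataclass
class Judgment:
    """Hypothetical record of a rater's two judgment calls for a pair of results."""
    same_main_content: bool  # landing pages show essentially the same main content
    query_is_specific: bool  # user wants one specific piece of content (e.g. lyrics)


def is_dupe(j: Judgment) -> bool:
    """Return True if the pair should be marked as dupes under the rule above."""
    if not j.same_main_content:
        # Different main content is never a dupe, whatever the query.
        return False
    # Same content: helpful for verification on specific queries,
    # redundant on broad queries.
    return not j.query_is_specific


# Broad query, same article on two sites: mark as dupes.
print(is_dupe(Judgment(same_main_content=True, query_is_specific=False)))
# Lyrics query, same lyrics on two sites: do not mark as dupes.
print(is_dupe(Judgment(same_main_content=True, query_is_specific=True)))
```

The same check applies whether the two results are on the same side or on different sides.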

Reporting Duplicate Results

When you notice that the results in two or more result blocks are duplicates, please click on the Report Dupe button of one of the results. The current result (the result you clicked on) will be highlighted by a thick solid red border and the button's name will change to Select Dupes.

You can then check results that are duplicates of the current result, and the checked results will be highlighted by a red dotted border.

The duplicate results that you checked will be annotated by "Dupe of..." text right below the sliding rating scale above the result block. After selecting all dupes, please click the red Select Dupes button to return to the normal rating mode. The button's name will change back to Report Dupe, and you will be able to report other sets of dupes (if there are any). If you change your mind, you can always un-check a result.


QUERY: [choosing and Installing a motorcycle battery]

URL 1: http://www.caimag.com/wordpress/2010/03/06/motorcycle-battery-how-to-choose-install

URL 2: http://www.articlesbase.com/motorcycles-articles/choosing-and-installing-a-motorcycle-battery-47798.html

Reason: Both of these results display the same article (which also appears on many other pages on the web). The only real difference between the landing pages is the ads displayed around the article. The query is broad enough that users would not benefit by search engines returning more than one of these results.


QUERY: [jason castro] URL 1: http://www.myspace.com/jasoncastromusic

URL 2: http://www.myspace.com/jasoncastromusic?MyToken=503599bf-01cf-4427-bdf4-d63920c107f9

Reason: These two results have the same landing page, even though the URLs are different. Users would not benefit by search engines returning both results.
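
To see why two different URLs can lead to the same landing page, it can help to compare them with the query string removed. The sketch below is an assumption-laden illustration, not part of the rating tool: the helper name `strip_query` is hypothetical, and treating everything after the `?` as irrelevant is only a hint. Many sites use the query string to select the content, so you must still open both landing pages and compare them.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Return the URL without its query string and fragment (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep scheme, host (lowercased), and path; drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, "", ""))


url_1 = "http://www.myspace.com/jasoncastromusic"
url_2 = "http://www.myspace.com/jasoncastromusic?MyToken=503599bf-01cf-4427-bdf4-d63920c107f9"

# The two URLs differ only in the MyToken parameter.
print(strip_query(url_1) == strip_query(url_2))  # True
```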

Not Dupes

QUERY: [material girl lyrics]

URL 1: http://www.lyricsfreak.com/m/madonna/material+girl_20086925.html

URL 2: http://www.lyrics007.com/Madonna%20Lyrics/Material%20Girl%20Lyrics.html

Reason: Even though both pages display the lyrics to the song "Material Girl", users would probably want to have the option to visit both pages so that they could verify the accuracy of the lyrics. Users could benefit by search engines returning more than one page with the lyrics to the song.
