Latest Web Technology News and Web Technologies August 15 2005, the latest breaking New York Web design news brought to you by,
Web Designs Now,Website Designs Now,New York Web Design Homepage,Web Design Services for New York, Connecticut, Long Island,New York Web Design Client Testimonials,Website Portfolio of New York Web Design, About this New York Web Design Firm,Contact this New York Web Design Firm

How Big is the Web?
Latest Web Technology & Web Design News, August 15, 2005

Web Map Tracks News Demand
Miva Settles Web PPC Suit
How Big is the Web?
Web Browser Market: Firefox vs IE
New Ways to Promote Web Fares

Excite@Home's Web Domain for Sale
Dogpile Adds MSN Web Search
IE7 Web Browser Not W3C Compliant
Web Browser Hack for MSFT ATV
Web Video in Flash 8

More Web Design News:
2011 Latest Web Technology News
2011 April
2011 March
2010 December
2009 April
2008 November
2008 October
2008 July
2008 June
2007 June
2007 May
2007 March
2006 November
2006 September
2006 August
2006 July
2006 June
2006 May
2006 April
2006 March
2006 February
2006 January
2005 December
2005 November
2005 October
2005 September
2005 August
2005 July
2005 June
2005 May
2005 April
2005 March
2005 February
2004 March
2004 February
2004 January
2003 December
2003 November
2003 October
2003 September
2003 August
2003 July
2003 June
2003 March - May



August 15, 2005

SAN FRANCISCO -- How big is the World Wide Web? Many Internet engineers consider that query one of those imponderable philosophical questions, like how many angels can dance on the head of a pin.
By John Markoff

But the question about the size of the Web came under intense debate last week after Yahoo announced at an Internet search engine conference in Santa Clara, California, that its search engine index -- an accounting of the number of documents that can be located from its databases -- had reached 19.2 billion.

Because the number was more than twice as large as the number of documents (8.1 billion) currently reported by Google, Yahoo's fierce competitor and Silicon Valley neighbor, the announcement -- actually a brief mention in a Yahoo company Web log -- set off a spat. Google questioned the way its rival was counting its numbers.

Sergey Brin, Google's co-founder, suggested that the Yahoo index was inflated with duplicate entries in such a way as to cut its effectiveness despite its large size.

"The comprehensiveness of any search engine should be measured by real Web pages that can be returned in response to real search queries and verified to be unique," he said on Friday. "We report the total index size of Google based on this approach."

But Yahoo executives stood by their earlier statement. "The number of documents in our index is accurate," Jeff Weiner, senior vice president of Yahoo's search and marketplace group, said on Saturday. "We're proud of the accomplishments of our search engineers and scientists and look forward to continuing to satisfy our users by delivering the world's highest-quality search experience."

The scope of Internet search engines, and thus indirectly the size of the Internet, has long been a lively area of computer science research and debate.

Moreover, all camps in the discussion are quick to note that index size is only loosely -- and possibly even somewhat inversely -- related to the quality of results returned.

The major commercial search engines use software programs known as Web crawlers to scour the Internet systematically for documents and index them.

The indexes themselves are maintained as arcane structures of computer data that permit the search engines to return lists of hundreds of answers in fractions of a second when Web users enter terms like "Britney Spears" or "Iraq and weapons of mass destruction."

On Sunday, researchers at the National Center for Supercomputer Applications attempted to shed light on the debate by performing a large number of random searches on both indices. They ran a random sample of 10,012 queries and concluded that Google, on average, returned 166.9% more results than Yahoo. In only 3% of the cases did the Yahoo searches return more queries than Google. The group said the Yahoo index claim was suspicious.

Neither Yahoo nor Google makes public the software algorithms that underlie their collection methods. In fact, those details are closely guarded secrets, which lie near the heart of heated competition now going on between Google, Yahoo and Microsoft over who can provide the most relevant answers to a user's query.

"It's a little bit silly," said Christopher Manning, a Stanford University professor who teaches a course on information retrieval. "It's difficult, and the whole question of how big indexes are has clearly become extremely political and commercial."

Even if the methodology is unclear, there is no shortage of outside speculation about what the different numbers mean, if anything.

Jean Veronis, a linguist in France and director of the Centre Informatique pour les Lettres et Sciences Humaines, posted a discussion on a blog noting that the increase in Yahoo references in French appeared consistent with the larger overall number that Yahoo was now reporting.

He added a caveat, however. "All of this should of course be taken with a large pinch of salt," he wrote. "So far, I haven't quite caught Yahoo red-handed when it comes to fiddling the books, but this could simply be because they are smarter with their figures than their competitors ;-)"

In contrast, a fellow blogger, Akash Jain, did his own random query test and wrote that it appeared that Google's index remained about 50% larger.

Other search engine specialists remained skeptical about the ability to estimate Web or index size as long as the search engines were being secretive about their methods. "I don't have any good way of checking," said Raul Valdes-Perez, a computer scientist who is chief executive of Vivisimo, which operates the Clusty search engine. "It feels a little like Harvard and Yale decided to argue over who has the most books in their respective libraries."

Web Designs Now
Back to the Top


 © Copyright 2011, All rights reserved  |  Privacy Web Design Forums  |  Web Design News  |  Advertise  |  About Us  |  Contact Us  |  W3C HTML 
 Related Websites: New-York-WebDesign.com