In February, an article in the
periodical Nature estimated that there are 800,000,000
accessible web pages (up to 40% being
duplicates). The Internet Software Consortium (www.isc.org)
reports that as of July there are 56,218,000 registered
host names (web.mit.edu is one), and Global Reach
(www.euromktg.com/globstats/)
reports that there are 204 million people who have
used the Internet. That's a lot of web sites and
a lot of people. And it’s only going to get bigger.
Or, to quote an expert: "And, for your information,
you Lorax, I'm figgering on biggering and Biggering
and BIGGERING and BIGGERING!!"
Search engines and web directories are trying to
find useful and efficient ways to deliver all of
these resources to people, but they have a long
way to go. In the meantime, I'll tell you what
I know about the current state of things.
A search engine like Altavista (www.altavista.com)
queries a huge database of web pages each time you
click its search button. A web directory like Yahoo!
(www.yahoo.com) hosts a collection of handpicked
web sites grouped together in nice little categories.
If you are looking for general topics (e.g. Education,
Statistics), big name sites (e.g. 3Com), or just
want to browse, then your best bet is a web directory.
Otherwise, consult search engines. Yet it isn't
quite that simple. According to the Search Engine
Showdown (www.notess.com/search/),
search engines and web directories have a high proportion
(25%+) of unique links. In other words, much of what
you find in one place, you won't find in another.
Altavista
appears to be the biggest search engine (304,066,986
pages from a search of '+*') despite independent
analysis claiming it is a close second to Fast
Search (www.alltheweb.com with 200,000,000+ pages
and counting). Then there is Northern Light (www.northernlight.com
with 150,000,000+ pages) and Google! (www.google.com
with 90,000,000+ pages). But size isn't all that
counts. Most search engines have a cryptic formula
to rank pages based on titles, text and Meta tags
(and my Dad thinks money is complicated). Nevertheless,
size and keyword formulas don't tell the whole story,
as some revolutionary ideas have just entered the
market. Google! ranks sites based upon how many
other sites link to them. In other words, if Webmasters
think a site is good enough to link to, then it
probably is. Direct Hit (www.directhit.com) ranks
sites based on user input, i.e. what people actually
click on when they search. Thus, while old engines
like Altavista only get bigger, Google potentially
gets better as Webmasters update their sites and
Direct Hit gets better as people use it.
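To make those two ranking ideas concrete, here is a minimal sketch in Python.
It is not how Google! or Direct Hit actually work (their formulas are not
public); it only illustrates "count the links pointing at a page" and "count
what people click on". The function names and the shape of the input data are
assumptions made up for this example.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Rank pages by how many distinct other pages link to them.

    links: iterable of (source_page, target_page) pairs (hypothetical format).
    """
    # Count each distinct (source, target) pair once and ignore self-links.
    inbound = Counter(
        target for source, target in set(links) if source != target
    )
    # Most-linked pages first; ties are broken alphabetically.
    return sorted(inbound, key=lambda page: (-inbound[page], page))


def rank_by_clicks(click_log, query):
    """Rank pages by how often searchers clicked them for a given query.

    click_log: iterable of (query, clicked_page) pairs (hypothetical format).
    """
    clicks = Counter(page for q, page in click_log if q == query)
    # Most-clicked pages first; ties are broken alphabetically.
    return sorted(clicks, key=lambda page: (-clicks[page], page))
```

Both rankings improve without the engine indexing a single extra page: the
first changes whenever Webmasters add or remove links, the second whenever
people search and click.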
Of course, just because you know where to search
doesn't mean you know how to search. I suggest searching
as specifically as you can (narrow your search) and
also trying synonyms in separate searches. You can
also usually put a '+' before a word, or before a phrase
in “quotations”, and the engine will require that
word or phrase to be present in all listed pages.
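As a rough illustration of the '+' and quotation rules, the sketch below
splits a query into required and optional terms and checks a page against
the required ones. The exact syntax varies from engine to engine, so treat
the pattern, the function names, the use of straight double quotes, and the
simple substring matching as assumptions for this example only.

```python
import re

# An optional '+', followed by either a "quoted phrase" or a single word.
TOKEN = re.compile(r'(\+?)(?:"([^"]*)"|(\S+))')


def parse_query(query):
    """Split a query into (required_terms, optional_terms), lowercased."""
    required, optional = [], []
    for plus, phrase, word in TOKEN.findall(query):
        term = (phrase or word).lower()
        if not term:
            continue  # skip empty quotes such as ""
        (required if plus else optional).append(term)
    return required, optional


def matches(page_text, query):
    """True if every '+' term appears somewhere in the page text."""
    required, _ = parse_query(query)
    text = page_text.lower()
    # Plain substring matching: a simplification, not real engine behavior.
    return all(term in text for term in required)
```

For example, the query `+"search engine" directory +yahoo` yields the
required terms "search engine" and "yahoo", with "directory" left optional.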
Finally, new ideas concerning how to bring information
to the end user are slowly appearing. Ask Jeeves!
(www.ask.com) tries to intelligently parse user
questions, and www.about.com is a web directory run
by third-party individual experts. Look for more
such third-wave search engines in the future...
P.S.
www.searchterms.com reports the top ten searches
are mp3, hotmail, sex, warez, britney spears, yahoo,
snes roms, pokemon, chat and ebay; and www.searchwords.com
reports that its top ten are mp3, hotmail, warez,
n64 roms, chat, icq, xxx passwords, xxx, greeting
cards, and ebay.
Copyright © 1999 by Gabe Weinberg
Check out other writing by Gabe Weinberg at: http://www.mindspring.com/~yegg/