
Search engines let you search for the occurrence of particular words on Web pages. Robots or spiders roam the Web, moving from site to site collecting the titles and contents of the Web pages. The words they collect are stored in databases, along with the addresses of the pages containing those words. The researcher may then search the database to find pages of interest.
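The database described above is often organized as an inverted index: a lookup table from each word to the addresses of the pages that contain it. The following is a minimal sketch of that idea, not how any particular engine works. The page addresses, the page text, and the short stop-word list are all made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; each real engine chooses its own.
STOP_WORDS = {"the", "in", "a", "of", "and"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each non-stop word to the set of addresses whose title or body contains it."""
    index = defaultdict(set)
    for url, (title, body) in pages.items():
        for word in tokenize(title + " " + body):
            if word not in STOP_WORDS:
                index[word].add(url)
    return index


def lookup(index, word):
    """Return the sorted addresses of pages containing the word."""
    return sorted(index.get(word.lower(), set()))


# Made-up pages standing in for what a robot might have collected.
PAGES = {
    "http://example.com/cameras": ("Digital Cameras", "Reviews of the latest digital cameras."),
    "http://example.com/bridge": ("Bridge Rules", "How to play the card game in a club."),
}

index = build_index(PAGES)
print(lookup(index, "cameras"))  # the camera page is found
print(lookup(index, "the"))      # stop words were never indexed, so nothing is found
```

Because the index is built ahead of time, a search only has to look up words in the table rather than re-read every page.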
Search engines do vary in several aspects:
The amount of text they index. Most claim to index every word on a page, although they may skip very common words like "the" or "in" (called stop words). Some make use of meta-tags by which the page author can describe the page and specify keywords.
Number of pages indexed. Google said in the summer of 2008 that it knew of one trillion URLs. This doesn't mean it has indexed them all, but it can reach them. Exalead, which is likely the smallest web search engine, reports 8 billion (February 2009).
The frequency with which sites are re-visited and re-indexed. This is hard to determine, but generally speaking search engines refresh some sites every day and the rest at intervals of up to six weeks. Some engines seem to have fresher results than others. The majors (Google, Yahoo, and Bing) compete to include real-time results: items that were posted moments earlier on Twitter and other social media sources.
The search syntax they use. The basics are the same at most search engines today. All major search engines search for ALL your words automatically. You can ask for the words to occur together by using quotation marks "like this". You can also exclude pages with a certain word by using the minus sign. However, there may be differences in restricting searches to the title of the document or the site domain.
The shortcuts they offer. Google, Ask, Bing, AOL and Yahoo offer shortcut commands for quick lookup of definitions, maps, and other day-to-day reference needs.
How they determine the relevancy of results. All use some mix of occurrence of words together, presence in the title, number of times a word is used, proximity and order of words. Many search engines have followed Google's lead to consider how well known the site is - who links to it, and how popular it is - who uses it. Ask.com also considers popularity - how often people looked at that result for that query. Some, especially Google, will personalize results according to your search history and preferences.
The presentation of results. The standard is to snip phrases from the page where the words occur. Google initiated this and Yahoo and most others do the same. Sometimes, however, you'll see the first two or three sentences from the page instead of a snippet. Search engines are also enhancing results with images or links to sections of the site.
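To make the relevancy factors listed above more concrete, here is a toy scoring function that combines three of them: the number of times a word is used, presence in the title, and proximity and order of words. It is only a sketch. The weights (5 for a title hit, 3 for adjacent words) are arbitrary assumptions, the pages are invented, and real engines use far more signals, including the link and popularity measures mentioned above.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, title, body):
    """Toy relevance score; the weights are illustrative assumptions."""
    terms = tokenize(query)
    title_words = tokenize(title)
    body_words = tokenize(body)
    total = 0.0
    for term in terms:
        total += body_words.count(term)  # number of times the word is used
        if term in title_words:
            total += 5.0                 # presence in the title
    # Proximity and order: bonus when two consecutive query words
    # appear next to each other, in the same order, in the body.
    for first, second in zip(terms, terms[1:]):
        for i in range(len(body_words) - 1):
            if body_words[i] == first and body_words[i + 1] == second:
                total += 3.0
                break
    return total


def rank(query, pages):
    """Return page addresses ordered from highest to lowest score."""
    return sorted(pages, key=lambda url: score(query, *pages[url]), reverse=True)


# Made-up pages for illustration.
PAGES = {
    "http://example.com/b": ("Office News", "The cameras in the lobby are new and the digital sign is not."),
    "http://example.com/a": ("Digital Cameras Guide", "Compare digital cameras by price."),
}

print(rank("digital cameras", PAGES))
```

Both pages contain both words, but the page that uses them together and in its title ranks first.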
A search engine is very easy to use. Just think of words to represent your question and enter them.
It is very good at finding specific topics or newly named concepts. For example, when the concept of viral marketing was new, the best way to find information about it was through a search engine - subject directories did not yet have a category set up.
Search engines are not the same. You have to invest time in learning their search rules, and recognizing their particular strengths and weaknesses.
You get what you asked for.
- The search terms you have chosen might not be the ones used by the creators of the Web pages. You might be looking for workflow re-engineering but the authors have called it business process redesign.
- There may be synonyms that you have not considered.
A search may bring up hundreds or thousands of hits. People are often overwhelmed by the count, and most people only look at the first 5 to 10 results. Don't stop there - scan at least the first 30 to 40.
Search results might not be relevant:
- Your search terms might be on a page but not in the form you were expecting. In the case of digital cameras, a simple search would pick up all instances where digital and cameras occurred somewhere on the page. Most of the hits would have nothing to do with the topic. [TIP - search for this as a phrase by putting quotation marks around it: "digital cameras".]
- Results can be irrelevant because the search term has many meanings. A bridge might cross a river or be a card game or a type of financing or a dental fixture. [TIP - always add context words, such as bridge card game rules.]
- There may be alternate spellings of words (labour and labor), misspellings, acronyms and short forms. Search engines are getting smarter at figuring out these relationships, but you need to watch for them too.
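The difference between a simple search and a phrase search can be sketched in a few lines. The matcher below treats plain words as "all must appear", quoted text as an exact phrase, and a leading minus sign as an exclusion, following the common syntax described earlier. It is a simplified illustration with made-up sample sentences, not the parser of any actual search engine.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def parse_query(query):
    """Split a query into quoted phrases, required words, and excluded words."""
    phrases = [tokenize(p) for p in re.findall(r'"([^"]+)"', query)]
    rest = re.sub(r'"[^"]*"', " ", query)  # what remains after removing phrases
    required, excluded = [], []
    for token in rest.split():
        if token.startswith("-") and len(token) > 1:
            excluded.extend(tokenize(token[1:]))
        else:
            required.extend(tokenize(token))
    return phrases, required, excluded


def contains_phrase(words, phrase):
    """True if the phrase words occur consecutively and in order."""
    n = len(phrase)
    return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))


def matches(query, text):
    """True if the text satisfies every part of the query."""
    phrases, required, excluded = parse_query(query)
    words = tokenize(text)
    return (all(w in words for w in required)
            and all(contains_phrase(words, p) for p in phrases)
            and not any(w in words for w in excluded))


# Made-up page text for illustration.
on_topic = "Our digital cameras are on sale."
off_topic = "A digital clock sits beside the security cameras."

print(matches("digital cameras", off_topic))        # both words appear somewhere
print(matches('"digital cameras"', off_topic))      # but never as a phrase
print(matches('"digital cameras"', on_topic))
print(matches('"digital cameras" -sale', on_topic)) # excluded by the minus sign
```

The off-topic sentence satisfies the simple search because both words appear somewhere in it, which is exactly the kind of irrelevant hit that the phrase search filters out.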
Search results may be heavy in advertisements and paid-placement listings. Watch for sections labelled as "sponsored" or "featured" - they will be the commercial paid-for listings.
No search engine has indexed everything on the Web. The one you are using might not have indexed the pages that hold the information you are seeking.
Do some hands-on with these great search engines. Start with Google.