
WSG Newsletter: AltaVista Update

Issue: March 4, 2002

"First you say you will and then you won't.
Then you say you do and then you don't.
You're undecided now. Well, what are you going to do?"

If ever a song applied to a search engine, this one fits AltaVista. AltaVista has been tweaking and fiddling with its search engine since the beginning of the year. It has a new look and new logic, but its freshness and reliability are questionable.

Update: March 14, 2002 - AltaVista has changed the Advanced Search to include the option to sort the results by selected words.

Update: August 29, 2002 - Sometime in June, AltaVista converted completely to an AND engine, to the great relief of its users.

AltaVista Search Engine

The Look

The look is now tabbed – and much better for it. It is much easier to check a search against the Web, the multimedia collection, News, and AltaVista’s Looksmart directory.

Advanced Search

But Search Assistant, with its forms-based input, has been merged with Advanced Search for boolean searching. The result is neither fish nor fowl. The forms-based input covers searching for all of the words, any of the words, a phrase, or none of the words, and also by location. It does not cover title, backward links, or related pages, all of which are useful features at AltaVista.

Boolean

There is a box for “free-form boolean query” (use of the word free-form is an interesting touch). This is similar to the old Advanced. For a couple of weeks AltaVista dropped the sort-by feature, but restored it in mid-March. Searchers should identify the words they most want to see, since these will control the sort. Use this to take several slices through the results set. (Updated March 14, 2002)

The “free-form” boolean used in AltaVista Advanced is different from the boolean in Basic. In Basic, the operators must be in upper case: AND, OR, AND NOT, NEAR. In Advanced they can be in either case (though this does change from time to time).

Proximity

Advanced also accepts some undocumented proximity operators. Greg Notess lists these in his review of AltaVista. We can ask for words or phrases to be “within” n words of each other: within 2, within 5, within 10, or any other number (within 10 is the equivalent of Near). There is also a Before operator (<), which requires one word to come before another and can be combined with Near (~) like this: origin <~ “bicameral mind”. This makes AltaVista Advanced the tool of choice when we need to tighten a search by looking for words near each other.
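To make the proximity operators concrete, here is a minimal Python sketch of how a "within n" test and a combined Before + Near test could behave on the words of a page. It is an illustration only: the function names, the whitespace tokenizer, and the reading of "within n" as "at most n word positions apart" are all assumptions. AltaVista has not documented how it evaluates these operators.

```python
def positions(tokens, word):
    """Return every index at which word occurs (case-insensitive)."""
    word = word.lower()
    return [i for i, t in enumerate(tokens) if t.lower() == word]


def within(text, a, b, n=10, ordered=False):
    """True if a and b occur at most n word positions apart.

    ordered=True also requires a to come before b, mimicking the
    combined Before + Near operator (<~). Hypothetical helper, not
    AltaVista's actual logic.
    """
    tokens = text.split()
    for i in positions(tokens, a):
        for j in positions(tokens, b):
            gap = j - i
            if ordered and gap <= 0:
                continue
            if 0 < abs(gap) <= n:
                return True
    return False


page = "the origin of consciousness in the breakdown of the bicameral mind"
print(within(page, "origin", "bicameral", n=10))                # True
print(within(page, "bicameral", "origin", n=10, ordered=True))  # False
print(within(page, "origin", "bicameral", n=5))                 # False
```

In the sample text the two words are eight positions apart, so "within 10" matches and "within 5" does not. The ordered test fails when the words are given in the wrong order.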

Case Sensitivity

The two vary in case sensitivity also. In Basic, phrases are case sensitive. “Tales and Fables” will pick up fewer pages than “tales and fables”. The quotation marks can be used on a single word to pick up upper case, e.g. “ASCII”. Advanced is case sensitive all the time, so no quotation marks are needed to find ASCII. When capitalization is not used, AltaVista searches for both cases.
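The general rule described above (a lowercase term matches either case, a capitalized term matches only that capitalization) can be sketched in a few lines of Python. The function name is hypothetical and the sketch ignores the Basic/Advanced difference over quotation marks. It models the described behaviour, not AltaVista's code.

```python
def case_match(query_term, page_word):
    """Lowercase query terms match any case; terms with capitals must match exactly."""
    if query_term.islower():
        return query_term == page_word.lower()
    return query_term == page_word


print(case_match("ascii", "ASCII"))  # True: lowercase query finds both cases
print(case_match("ASCII", "ascii"))  # False: capitalized query is exact
print(case_match("ASCII", "ASCII"))  # True
```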

Is it All Or Any?

Notice: August 29, 2002 - AltaVista now defaults to a true AND in its Web search. The following counts no longer apply for web searching. OR still applies for Directory and Images.

Most search engines default to looking for all the words in a query. AltaVista plays at being both an AND and an OR engine. For months it would look for ALL words when 3 or 4 were entered, and was more likely to pick up ANY words beyond that. During January 2002, AltaVista looked for ALL words regardless of number. In fact, John Ellis, Sr. VP of Engineering at AltaVista, confirmed this with Tara Calishain of ResearchBuzz in February (see ResearchBuzz, Feb 6–13, 2002), saying “Extensive testing on our index over the past few months has shown that ANDing of the query terms provides users with better overall results. This change is the latest step in AltaVista's continuing mission to provide users with the best search results on the web.” But by March 1, 2002 it had switched back to a mix.

Part of the puzzle in the counts is due to AltaVista’s ability to identify phrases.

Consider the following search statements.

Word Search
roch voisine - 2658 results.
"roch voisine" - 2042 results. The count drops with the phrase.
roch voisine concert - 2165 results. Looks like an AND, but a true AND is 398 results.
"roch voisine" concert - 1805 results. Also looks like an AND, but requiring ALL words would have given 324 results (see below).
+"roch voisine" concert - 1805 results. This requires roch voisine and makes concert optional, yet it gives the same results as without the +. A true OR engine would have shown 2042 results.
"roch voisine" +concert - 2.5 million results. Put the + on concert alone and the results soar.
+"roch voisine" +concert - 324 results. Put the + on both terms to get the AND, and the results drop.

Title Search
title:"roch voisine" - 27 results.
title:"roch voisine" concert - 244,916 results. No AND here. We must add the AND ourselves.
title:"roch voisine" AND concert - 6 results.

The WEB Directory is a full OR.

chomsky linguistics – 7726 results. This is roughly the sum of chomsky (85) and linguistics (7674).

+chomsky linguistics – 85 results. This requires chomsky and ranks linguistics at the top. Gets the same number as chomsky alone (as it should).

+chomsky +linguistics – 33 results. This is the AND search and could be written as chomsky AND linguistics.
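The "roughly" can be tightened with a quick check. If the Directory is a true OR, pages matching both words are counted only once, so the OR count should equal the two single-word counts added together minus the AND count. A small Python check using only the counts reported above (the function name is illustrative):

```python
def expected_or_count(count_a, count_b, count_both):
    """Size of the union of two result sets (inclusion-exclusion)."""
    return count_a + count_b - count_both


chomsky = 85
linguistics = 7674
both = 33  # +chomsky +linguistics

print(expected_or_count(chomsky, linguistics, both))  # 7726
```

The result matches the 7726 reported for chomsky linguistics, which supports reading the Directory as a full OR.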

The Web search accepts AND, OR, AND NOT, but not NEAR.

NEWS Search is an AND

Chomsky alone finds 5 stories on March 4, 2002. Linguistics has 6. Together chomsky linguistics finds 2.

The News search ignores AND, OR, AND NOT. It will accept the minus sign (–) to exclude stories.

IMAGE Search is an OR.

Chomsky has 211 images, and linguistics has 819. Together they have 1023 results (close enough). Put a + sign in front of chomsky, linguistics, or both, and you’ll get 5 results. Whichever you do, it switches to an AND. Image search will accept boolean.

What are we to make of this?

What we do to broaden or narrow a search query is influenced by whether a search engine looks for all or any words. It may be best to continue to treat AltaVista as an OR engine. If the results are very high, require the key words and concepts with the + sign (or connect the words with AND) as well as using more specific terms. If the results are too few, use fewer or more general words and/or construct some alternatives using the Boolean OR. At AltaVista, only the News search won’t recognize AND, OR.
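As a minimal sketch of that advice, the two helpers below build a narrowing query (every term required with +) and a broadening query (alternatives joined with OR). The helper names are hypothetical, and "tour" is just an illustrative alternative term. The query syntax is the + and OR usage described in this newsletter.

```python
def quote(term):
    """Wrap multi-word terms in quotation marks so they are searched as a phrase."""
    return f'"{term}"' if " " in term else term


def require_all(terms):
    """Narrow: put + on every term to force an AND."""
    return " ".join("+" + quote(t) for t in terms)


def any_of(terms):
    """Broaden: join alternatives with the Boolean OR."""
    return " OR ".join(quote(t) for t in terms)


print(require_all(["roch voisine", "concert"]))  # +"roch voisine" +concert
print(any_of(["concert", "tour"]))               # concert OR tour
```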

Display

The description AltaVista displays for a page is now dynamically generated. It will use part or all of a meta description when available and will create the rest from text in the page that seems to best relate to the query.

In this search for "Margaret Atwood" poetry, AltaVista finds:

"By Brittney Goodman Margaret Atwood is my favorite author. Her works include the ... Alias Grace. The following are some links concerning Margaret Atwood and her works. The official M. Atwood ..." URL: http://www.moorhead.msus.edu/chenault/atwood.htm

The actual page has Margaret Atwood WWW Resources by Brittney Goodman. The first paragraph is:

"Margaret Atwood is my favorite author. Her works include the novels, Surfacing, The Edible Woman, The Handmaid's Tale, Lady Oracle, and The Robber Bride, and also some fine poetry and short works. I've included links concerning her latest novel, Alias Grace. The following are some links concerning Margaret Atwood and her works"

AltaVista Canada

There is some good news about AltaVista Canada but not enough for Canadian searchers to rely on this search engine. Google.ca is much better.

AltaVista Canada seems to have recovered its ability to identify Canadian sites regardless of domain. It is also showing the number of results again (that had been suspended for a time).

However, it seems to have dropped many government pages. The option on the front page to search Government pages has not worked for several weeks (AltaVista has not responded to questions about this). More troubling is that a search at AltaVista World (and Canada) for host:.gc.ca (to find the number of pages indexed that are in the Government of Canada domain) produces only 46,988 pages compared to 1.8 million in November 2001. That includes French and English.

AltaVista Canada still has the old AltaVista Advanced Search, where one can use sort-by to hand-rank results for both Canada and World. The undocumented proximity operators work here too: within and before (<~ to look for a word before another but within 10 words).

By my count, AltaVista Canada has indexed 11.4 million pages. This seems low. When Telus was co-owner, they claimed at least 14 million in 1999.

The breakdown was: .ca 8,278,968 / .com 2,171,775 / .net 401,643 / .org 562,936 / .info 6 / .biz 4
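The 11.4 million figure can be checked by adding up the breakdown. A short Python sketch using only the per-domain counts listed above:

```python
counts = {
    ".ca": 8_278_968,
    ".com": 2_171_775,
    ".net": 401_643,
    ".org": 562_936,
    ".info": 6,
    ".biz": 4,
}

total = sum(counts.values())
print(total)  # 11415332

# Share of the index held by each domain, largest first.
for domain, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{domain:6} {n / total:.1%}")
```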

Freshness does not look good either. AltaVista Canada can’t find stories on Israel at the Globe and Mail or the National Post with a last modified date later than February 3, 2002.

Will AltaVista Survive?

Searchers have been leaving AltaVista in droves. During January, Jupiter Media Metrix tracked where web searchers searched. Only 5.7% used AltaVista, whereas 24.5% went to Google and 36.3% used MSN (since searches in the location bar of the IE browser were part of the count, searchers may not have intentionally used MSN). The report has some flaws, specifically that it counts unique users rather than real traffic, but the message is clear: AltaVista is slipping. In part this is because Google really is so good. But the gyrations of the last year, the summer months when AltaVista didn’t update its databases, the use of paid listings (Products and Services), and the practice of paid inclusion with the bad press surrounding it all surely contributed.

Today it is more of a niche tool, to be used when we wish to build more complicated boolean constructions or to narrow a search by using the proximity operators to find words close together. AltaVista is also the only engine that really handles case well. For these reasons, as well as the News search and the multimedia collections, it has a place in our tool kit. This could change if AltaVista fails to keep its databases fresh.

Reference

WebSearchGuide has a Guide to AltaVista and compares its features to other engines on the Search Engine Comparison charts. Click on Tutorials and select Research.

Other articles about change at AltaVista.

Which Search Engine is Really #1? Metrics Agencies Close in on Reality (Feb 28, 2002) By Andrew Goodman. Comments on results from Jupiter Media Metrix.

AltaVista Makes Some Changes (Feb 27, 2002) Research Buzz - news thumbnails and advanced search.

AltaVista Gets a Facelift, Again (Feb 21, 2002) Chris Sherman describes the new interface.

AltaVista offers shortcuts to the Invisible Web (Feb 11, 2002) by Chris Sherman in Search Day. Shortcuts are to quotes, weather, maps, phone directories. (This is nice but it's not a big deal.)

AltaVista Shortcuts Tap the Invisible Web to Provide Instant Answers to the Web's Top Queries. (Feb 12, 2002) AltaVista Press Release

AltaVista Software Wins CMP Media's Intelligent Enterprise Reader's Choice Award for Best Information Retrieval Product (Jan 29, 2002) AltaVista Press Release - corporate users like AltaVista for internal use.

 

 

 


Newsletter by Gwen Harris for whom AltaVista used to be a first choice.


Copyright Gwen Harris
A service to subscribers of WebSearchGuide (http://www.websearchguide.ca)



URL: http://www.websearchguide.ca
© Gwen Harris 2002 Revised August 29, 2002