WSG Newsletter:
Will Exalead Slay Search?

Issue: February 21, 2006

For all the information professionals who yearn for a search engine that can go beyond keyword searching to present alternate views of the results, Exalead is the answer, especially since Exalead One Web Search has grown to four billion pages and is expected to double again by mid-2006.

Exalead One Web Search - www.exalead.com/search

Exalead is a French company with offices in the U.S. and Italy and operations in Germany, Austria and Switzerland. Engaged in developing enterprise search products, Exalead SA launched the web search engine in October 2004 as a showcase for its technology. Since then Exalead has expanded its one-search product line to include desktop search, enterprise search, and most recently, workgroup search designed for small office environments. Common to these is the idea of unified searching whether in interface, presentation of results, or platform.

Navigation

The technology is remarkable. François Bourdoncle, CEO of Exalead, calls it not a search engine but a “navigation engine” designed to interactively engage users and help them navigate through the results.

Navigation begins with the list of Related Terms on the left side. Exalead uses statistical linguistic analysis to extract themes from the search results. The related terms will afford an overview of the main aspects of the search topic. Selecting one term narrows the set of matching results and may lead to a second level of analysis. This has some likeness to AltaVista’s Live Topics, a project on which both Bourdoncle and Exalead’s chief technology officer, Patrice Bertin, once worked.

Related categories are the second part of the navigation scheme. Categories are extracted from the structure used by the Open Directory Project (ODP). Results that Exalead has indexed from ODP will also show their category. Although there have been many complaints of deteriorating quality and service at ODP, the subject tree is still strong and its use in Exalead may help a searcher get the bigger picture.

Next, results are grouped by continent and country as determined from the URL. Searchers with a geographic interest will find this useful since it makes it easier to drill into a particular country. However, though valuable, it is not necessarily accurate. I’ve seen many misclassifications: U.S. and Canadian sites listed as European, and some European sites listed as U.S.
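To see why URL-based grouping can go wrong, here is a minimal Python sketch. It assumes the country is guessed from the last label of the host name; that is an assumption for illustration, not a description of how Exalead actually does it. The `TLD_TO_REGION` table and the `guess_region` function are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical lookup table; a real engine would need a far larger one.
TLD_TO_REGION = {
    "ca": ("North America", "Canada"),
    "fr": ("Europe", "France"),
    "de": ("Europe", "Germany"),
    "it": ("Europe", "Italy"),
}

def guess_region(url):
    """Guess (continent, country) from the last label of the host name."""
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    # Generic suffixes such as .com or .org carry no country information,
    # so the lookup returns None for them.
    return TLD_TO_REGION.get(tld)

print(guess_region("http://www.websearchguide.ca"))   # a country-code suffix
print(guess_region("http://www.exalead.com/search"))  # a generic suffix
```

A French company on a .com address gives this approach nothing to work with, so any engine that classifies by URL has to fall back on other signals, and those can be wrong.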

The remaining search filters are document language (quite good) and the types of documents in the result set. The latter can be extremely helpful in quickly identifying documents in pdf format (often used in publishing articles and brochures), doc for Word documents (essays, studies, reports), swf for Flash files, ppt for PowerPoint presentations, as well as txt, rdf, and xls (Excel spreadsheets). It’s a nice reminder to the searcher to consider exploring a document type for possibilities.

Audio / Video

Exalead also offers the option to view pages that have audio or video files. These are pages that have some of your terms along with links to multimedia. It does not mean that the audio or video file is actually on your topic, but it can lead to pleasant surprises. Click on audio and you may find podcasts where your topic is discussed; similarly video may find movies, tours, or demos. See this on a search for sustainable product design.

No search engine can ignore RSS feeds today. Exalead makes finding these easy with another button that locates pages with an associated feed. An RSS feed typically contains information about updates to a site, perhaps headlines for new postings to a blog, which can be picked up by a newsreader. Today there is a multitude of newsreaders: embedded in browsers, available as browser plug-ins, offered as additions to a personal Yahoo or MSN page, or distributed as easy-to-find software. A search for newsreaders in the title at Exalead will do the trick.
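For readers curious about what a newsreader actually picks up, the following Python sketch pulls the headlines out of a small RSS 2.0 document using only the standard library. The feed content, the example.com addresses, and the `headlines` function are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# A hypothetical RSS 2.0 feed with two postings.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/</link>
    <description>Hypothetical feed for illustration</description>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
    </item>
  </channel>
</rss>"""

def headlines(feed_xml):
    """Return (title, link) pairs for every item in the feed."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]

for title, link in headlines(SAMPLE_FEED):
    print(title, "->", link)
```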

Syntax

But Exalead is most notable for reintroducing syntax that other engines dropped. Exalead has automatic word stemming, truncation, and a proximity operator.

Word stemming occurs on searches with two or more words (requires setting under Advanced Search). It’s not perfect, but by and large it will pick up singular and plural and some verb conjugations.

Some examples:

  • archives preservation conservation will find archives, archive and archival
  • sustainable product design – products, designer

In the second example, it does not pick up all the variations we might want for either sustainable or design. The truncation operator * looks for words with a common linguistic root. Design* brings back designs, designing, designers; and sustainab* finds sustainably, sustainability, sustainable.
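As a rough illustration of truncation, the Python sketch below treats * as a plain prefix wildcard over a word list. That is a simplifying assumption: Exalead's own matching may be linguistic rather than purely prefix-based, and the `expand_truncation` function and word list here are hypothetical.

```python
def expand_truncation(pattern, vocabulary):
    """Return the words in vocabulary that start with the stem before '*'."""
    stem = pattern.rstrip("*").lower()
    return sorted(w for w in vocabulary if w.lower().startswith(stem))

words = ["design", "designs", "designing", "designers",
         "sustain", "sustainable", "sustainability", "sustainably"]

print(expand_truncation("design*", words))
print(expand_truncation("sustainab*", words))
```

Note that "sustain" is not matched by sustainab* because it is shorter than the stem. Where the stem is cut decides which variants come back.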

But using * can sometimes seem to throw off the ranking of the results – even more reason to use the clustering provided by the related terms.

And it does not work as expected with all words. The search taxon* enterprise search brings up fewer results than taxonomy search enterprise.

Where automatic stemming or the truncation operator fails to bring up the terms you hope for, take control by using OR:

(sustainable OR sustainability OR sustainably)
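If you build such queries often, a few lines of Python can assemble the OR group from a list of variants. This is only a convenience sketch; `or_group` is a hypothetical helper that produces the query text, and it does not talk to Exalead.

```python
def or_group(variants):
    """Join word variants into a parenthesised OR expression.

    Multi-word variants are wrapped in double quotes so they stay a phrase.
    """
    parts = ['"%s"' % v if " " in v else v for v in variants]
    return "(" + " OR ".join(parts) + ")"

print(or_group(["sustainable", "sustainability", "sustainably"]))
```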

Exalead recognizes the Boolean operators AND, OR, NOT, NEAR and NEXT (for words together in that order). A search query is either done with keywords or a full Boolean construction – not a mix of the two.

The NEAR operator requires that the two words be within 16 words of each other. This works very nicely to reduce results for taxonomies NEAR “enterprise search” or (sustainable OR sustainability OR sustainably) NEAR product NEAR design.
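The idea behind a proximity operator can be sketched in Python as follows. The 16-word window comes from the text above; how Exalead counts words, and how it handles phrases, is not known to me, so this version handles single words only and the `near` function is hypothetical.

```python
import re

def near(text, left, right, window=16):
    """True if `left` and `right` occur within `window` words of each other."""
    words = re.findall(r"\w+", text.lower())
    left_pos = [i for i, w in enumerate(words) if w == left.lower()]
    right_pos = [i for i, w in enumerate(words) if w == right.lower()]
    # Compare every occurrence of one word with every occurrence of the other.
    return any(abs(i - j) <= window for i in left_pos for j in right_pos)

text = "A taxonomy gives structure to enterprise content and improves search"
print(near(text, "taxonomy", "search"))            # eight words apart
print(near(text, "taxonomy", "search", window=3))  # a tighter window
```

A page that mentions both words but in unrelated sections would pass a plain AND search and fail this test, which is why NEAR trims the result set.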

Lastly there is a phonetic search but this has minimal use. It will convert dislexia to dyslexia but accepts bisness with nary a hickup. One person’s phonetic spelling is another person’s accepted misspelling. On hickup Exalead looks for that and hiccup on a phonetic search, but for bisness it sticks with the misspelling.

Exalead also supports searching the page title, site, url and filetype, and checking what pages link to a url.

The Advanced Search offers sorting by date, newest to oldest or the reverse. This is probably the date the page was last indexed. Dates on web pages are notoriously unreliable, but sort:new will give some sense of how current Exalead is in its indexing.

Viewing Aids

There are several timesaving features for viewing the results.

Exalead is one of the few engines to show thumbnail shots of the page. It’s surprising how viewing these small images can influence one’s choice. At Exalead they are optional. Click on the controls in the top right.

Exalead also presents a preview. Clicking on a search result will load the live page into a small scrollable box. It shows the number of times your terms occur on the page and will highlight them. All else being equal, a page with 50 occurrences is likely to be more relevant than one with 25.
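The count shown in the preview can be approximated with a simple tally of how often each query term appears in the page text. The Python sketch below is an assumption about the general idea, not Exalead's method; `term_hits` and the sample page are hypothetical.

```python
import re
from collections import Counter

def term_hits(page_text, query_terms):
    """Count how many times each query term appears in the page text."""
    words = Counter(re.findall(r"\w+", page_text.lower()))
    return {t: words[t.lower()] for t in query_terms}

page = "Sustainable design starts early. Good design reduces waste."
hits = term_hits(page, ["sustainable", "design", "product"])
print(hits)                # per-term counts
print(sum(hits.values()))  # total occurrences on the page
```

A raw total like this says nothing about page length or which terms are missing, so treat it as a hint rather than a ranking.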

The preview pane can be enlarged for easier use of its features: a link to open the original document; and an option to add a bookmark to an Exalead box (sets a cookie on your computer) – very handy for gathering pages to review. If the page is not in your language, there will also be a translate button.

Experiment

Exalead’s search interface is more complex than that of other search engines. Most searchers will immediately appreciate the Related Terms for refining a query. However, the other navigational and viewing aids can take time to fully comprehend, and searchers will need to experiment to see how to get the best effect from the ample syntax. Those who invest that extra effort will be rewarded.

Syntax Examples

intitle:"product design" sustainable -- product design in the title, sustainable anywhere

site:gc.ca privacy -- pages at the Government of Canada that mention privacy

inurl:privacy protection -- privacy anywhere in the url and protection anywhere on the page

"privacy protection" site:ca filetype:pdf -- Documents in Canada published in pdf format that mention privacy protection.

link:www.exalead.com "search engines" -- pages that link to Exalead and mention search engines.

"web searching" NEAR (strategies OR tools) -- pages where strategies or tools is within 16 words of the phrase web searching

France's Exalead Continues Growing, Database Passes 4 Billion Page Mark by Gary Price, SEW Blog (Feb 6, 2006)

Exalead: A Potentially Powerful New Search Engine by Mary Ellen Bates, Virtual Chase (June 2005)

Newsletter by Gwen Harris, a convert to Exalead.


Copyright Gwen Harris
A service to subscribers of WebSearchGuide (http://www.websearchguide.ca)

