Deep Web

For years people talked about the "invisible web": the portion of content on the Web that the search engines had not indexed. But the content is not so much invisible as inaccessible. Today the discussion is about the "deep web", because the inaccessible content is stored deep in databases on the web.

Simply put, web search engines index the "surface web": the pages or content areas that are easy to find and present no serious impediments to reading. WebSearchGuide, for example, is in the surface web. It consists of flat web pages; you don't need to be a subscriber to have access, and you don't need to run any queries to find this page.

The opposite is true for a vast amount of content. No one knows how much, but the figure of 90% has been quoted for many years as the share of web content that web search engines can't or won't capture. By and large, this is content stored in databases, which can only be extracted by logging into the service and running a query.

Let's take journal articles as an example. Journals are normally stored in such databases: to retrieve the article you want, you may need to run a search on the name of the journal, keywords in the title, or a date range. A web search engine can't do this: it can't get past the login, and it would have difficulty filling in the search form to match your interest. You, the user, must have the subscription and fill in the form.

Web search engines have some access. Google has a way to fill in simple forms and extract information it can index. Ask.com has set up routines to get answers on television schedules and sports scores. But there is much that is still very deep. Our choices are to use federated search engines where they exist, and to think vertical search: seek out the specialty portals, sites, and databases.

Using Federated Search

There are some federated search services that will do the digging for you.
Federated search is the same concept as web metasearch, but it deals with greater diversity in interfaces and databases: the query is constructed to meet the requirements of each individual service and, as with metasearch, the searches are run simultaneously. Deep Web Technologies is a leader in this and has created federated search portals.
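To make the federated search idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular service works: the two "databases" are small in-memory lists, and every name in it (JOURNALS, REPORTS, Source, federated_search) is hypothetical. It shows the two points described above: one query is rewritten to suit each source's own search form, and the rewritten queries are sent to all sources at the same time.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical in-memory "databases" standing in for deep web services.
JOURNALS = [
    {"title": "Deep web crawling strategies", "journal": "Search Quarterly"},
    {"title": "Consumer health portals", "journal": "Health Informatics"},
]
REPORTS = [
    ("2009-03", "Federated search in libraries"),
    ("2007-11", "Deep web size estimates"),
]


@dataclass
class Source:
    name: str
    build_query: Callable[[str], Dict]   # rewrite keywords for this source
    run: Callable[[Dict], List[str]]     # execute the rewritten query


# Source 1 expects a single phrase to match against titles.
def journal_query(keywords: str) -> Dict:
    return {"title_contains": keywords.lower()}


def journal_run(query: Dict) -> List[str]:
    phrase = query["title_contains"]
    return [r["title"] for r in JOURNALS if phrase in r["title"].lower()]


# Source 2 expects a list of terms that must all appear.
def report_query(keywords: str) -> Dict:
    return {"terms": keywords.lower().split()}


def report_run(query: Dict) -> List[str]:
    return [title for _, title in REPORTS
            if all(term in title.lower() for term in query["terms"])]


SOURCES = [
    Source("journals", journal_query, journal_run),
    Source("reports", report_query, report_run),
]


def federated_search(keywords: str) -> Dict[str, List[str]]:
    """Send one query to every source at once and collect results by source."""
    def ask(source: Source):
        return source.name, source.run(source.build_query(keywords))

    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        return dict(pool.map(ask, SOURCES))


if __name__ == "__main__":
    print(federated_search("deep web"))
```

A real federated search service would replace the in-memory lists with calls to each database's own login and search form, which is where most of the work lies.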
Kosmix can also dig into sources to bring back a resource page of starting points on a topic. You might ask it about the deep web and be pleased to read a definition from Wikipedia, plus some articles and videos, most of which are relevant.

Added Dec 2009: DeepDyve also provides some specialty searching of science and engineering. Searching is free; reading the article usually has a price.

Vertical Search

One of the chief elements of being a good web searcher is knowing the best resource to use and going directly to it. You can often find these through a good subject directory: for example, you would find MedlinePlus, produced by the U.S. National Library of Medicine, by using ipl2 or the Yahoo Directory. This is an excellent resource, but there are others in the health field. There may be a vertical search or specialty search engine that will let you search several health sites at a time.

Vertical search engines are created to meet our need to search within a subject domain. They focus on the particular information needs of a specific market segment. For example, there are search engines that crawl consumer health sites: they serve a particular need (health information) for a specific market segment (the consumer, and sometimes the health practitioner). Healthline is an example of a vertical that searches the best health sites on the Web. It provides an enriched interface that makes it easier to navigate by health topic and to refine the search, and it lets you personalize the site to your use. It's much easier to answer a health or medical question here than through a general-purpose search engine.

How do you find them?

1. Subject directories are a start. There used to be meta-directories that specialized in databases, but these have been hard to maintain and have fallen on hard times.
2. Run a query at a general (horizontal) search engine, e.g. health vertical search or health portal.
3. Stay tuned to newsletters and services that review and comment on resources. To name a few --
The Web Search Quiz will help you review the main points in this section.