Blekko, a modest-sized search engine with a distinctive approach to indexing curated sites, has been taken over by IBM Watson. Blekko’s volunteers identified quality sites (thereby keeping out spam) and classified them using #slashtags – and searchers used those tags for more exact searches. There was more to it – and it is likely the “more” that IBM Watson wanted for its work in cognitive computing.
“Blekko brings advanced Web-crawling, categorization and intelligent filtering technology. Its technology crawls the Web continually and gathers information from the most highly relevant and most credible Web pages. It uses classification techniques to create thousands of topical categories, making that data more useful and insightful.”
[From Data, Data Everywhere – Now a better way to understand it, Building a smarter planet, March 27]
Matt McGee at Search Engine Land gives a recap of Blekko’s short life, from 2008 to the present. Goodbye Blekko: Search Engine Joins IBM’s Watson Team
Google had developed the means to identify and employ entity analysis at least as far back as 2013, as this posting by Bill Slawski shows. The purpose of the patent he describes is “to provide a factual response to a query showing different aspects related to a ‘single conceptual entity.'”
Google’s Knowledge Cards by Bill Slawski, SEO by the SEA (Mar 18)
Knowledge cards assemble “name, description, image, facts and related searches.”
Fascinating examination of a patent by Google to determine facts about some topic from patterns on Web pages.
Google On Crawling The Web Of Data by Bill Slawski, SEO by the Sea (Feb 22)
This type of pattern-matching and extraction of facts is part of how Google uses the Web as a database of information. By extracting facts and storing them in a data repository, like Google’s knowledge graph, it makes those facts available as direct answers.
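As a rough illustration of this kind of pattern-matching (the patterns and page text below are invented for the example; they are not taken from Google’s patent or actual system), a minimal sketch might look like:

```python
import re

# Illustrative patterns only; a real system would learn these from many pages.
PATTERNS = [
    # "X was born in Y" -> (X, born_in, Y)
    (re.compile(r"(?P<subj>[A-Z][\w ]+?) was born in (?P<obj>[A-Z][\w ]+)"), "born_in"),
    # "X is the capital of Y" -> (X, capital_of, Y)
    (re.compile(r"(?P<subj>[A-Z][\w ]+?) is the capital of (?P<obj>[A-Z][\w ]+)"), "capital_of"),
]

def extract_facts(text):
    """Scan page text and return (subject, relation, object) triples."""
    facts = []
    for pattern, relation in PATTERNS:
        for m in pattern.finditer(text):
            facts.append((m.group("subj").strip(), relation, m.group("obj").strip()))
    return facts

page = "Paris is the capital of France. Marie Curie was born in Warsaw."
print(extract_facts(page))
```

Triples extracted this way can be stored in a repository and served back as direct answers, which is the general idea the posting describes.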
DARPA, in the US Department of Defense, has launched a new search engine named Memex, which is intended to expose the “dark” web of hidden content.
Darpa Is Developing a Search Engine for the Dark Web by Kim Zetter, Wired (Feb 10)
The search engine was described and demoed on 60 Minutes. The segment is only 5 minutes; you can view it from CBS – DARPA: Nobody’s safe on the Internet. Mind – the objective is to help law enforcement track down crime, and to do so through data mining.
Developers haven’t given up on data visualization for the search interface. Etsimo in Finland is a new contender with its SciNet interface. It is available as a demo. This TechCrunch article includes a short video.
This Search Engine Wants More Human Input, TechCrunch (Feb 2)
The SciNet approach to the increasingly hard problem of effective search is to involve the human user more by having them steer the algorithmic results — by signaling multiple intents as the process progresses. This generates a dynamic and visible spectrum of results — depending on what they are looking for, or interested in — and allows them to selectively drill down into complex queries in an informed, and self-guided way. The basic idea being that human-steered results are better than algorithms alone.
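SciNet’s actual algorithms aren’t described here, but the general idea of letting user feedback steer ranking can be sketched with a classic Rocchio-style query update. This is a hypothetical illustration with invented term vectors and weights, not Etsimo’s implementation:

```python
# Rocchio-style relevance feedback: nudge the query vector toward documents
# the user marks relevant and away from those marked not relevant.
# The weights (alpha, beta, gamma) and vectors are invented for illustration.

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    dims = len(query)
    new_q = [alpha * q for q in query]
    for doc in relevant:
        for i in range(dims):
            new_q[i] += beta * doc[i] / len(relevant)
    for doc in nonrelevant:
        for i in range(dims):
            new_q[i] -= gamma * doc[i] / len(nonrelevant)
    # Clamp negative weights to zero, as is conventional.
    return [max(0.0, w) for w in new_q]

# Hypothetical term axes: ["search", "visual", "interface"]
query = [1.0, 0.0, 0.0]
liked = [[0.5, 1.0, 0.0]]       # a result the user marked relevant
disliked = [[1.0, 0.0, 1.0]]    # a result the user steered away from

print(rocchio(query, liked, disliked))
```

Each round of feedback reshapes the query, so the visible spectrum of results shifts with the user’s signaled intents – the “human-steered” loop the article describes.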
The visual interface never seems to grab users. It will be interesting to see if this company succeeds.
What I have learned from this posting from SEO by the Sea — Google’s Query Language (Nov 13)
– Google has patents on something it calls a “browseable fact repository” – this became the Knowledge Graph.
– Google engineers considered a query language for this. It would have had to be something like SPARQL, the language used to query graph-structured data such as RDF. Bill Slawski describes the key bits, which we can be pretty sure almost no one would ever learn.
– Google decided on a “united search interface” instead – understandable.
– Handy reference - Punctuation, symbols and operators in search.
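To illustrate what a query against such a fact repository amounts to (in Python rather than any notation from the patents, and with invented facts), the core operation is matching patterns over stored triples:

```python
# A toy "fact repository": (subject, attribute, value) triples.
# The facts and the query function are invented for illustration; this is
# not Google's actual repository schema or query language.
FACTS = [
    ("Mount Everest", "height_m", "8849"),
    ("Mount Everest", "located_in", "Nepal"),
    ("K2", "height_m", "8611"),
    ("K2", "located_in", "Pakistan"),
]

def query(subject=None, attribute=None, value=None):
    """Return triples matching the given fields; None acts as a wildcard."""
    return [
        (s, a, v) for (s, a, v) in FACTS
        if (subject is None or s == subject)
        and (attribute is None or a == attribute)
        and (value is None or v == value)
    ]

# "What is the height of Mount Everest?" as a structured query:
print(query(subject="Mount Everest", attribute="height_m"))
```

A unified search interface hides exactly this kind of structured lookup behind a plain-language query box, which helps explain Google’s choice.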
This article provides a good overview with illustrations of how search works. The classic types of search – navigational, informational, and transactional – are noted, and also that Google addresses all of these using “semantic and exploratory techniques” in information retrieval. Search has changed greatly over the past 5 to 10 years to become much more personalized, more monitored, and more commercialized – all concerns that are explored here.
There is more to search than Google by Royan Ayyar, Semrush (Oct 24)
Entities are the key to search-engine placement – not keyword trigger words, but content that names people, places, and events, and provides other important information. Bill Slawski gives the example of building a page on Black History that will rank well based entirely on meaningful content.
How I Came To Love Entities by Bill Slawski, SEO by the Sea (Oct 24)
The best designed sites provide a clear navigation structure (such as a taxonomy or table of contents) to direct users to content. That structure, as we learn in this article, informs and guides users in ways that keyword search doesn’t. Keyword search requires that users already know what to look for and how to phrase a query.
Search Is Not Enough: Synergy Between Navigation and Search, by Raluca Budiu, Nielsen Norman Group (September 7, 2014)
Navigation serves important functions: it shows people what they can find on the site, and teaches them about the structure of the search space. Using the navigation categories is often faster and easier for users than generating a good search query. Plus, many times site search does not work well or requires users to have a good understanding of its limitations.
Search engines today – especially Google and Bing – seek to identify entities and their relationships. This posting distinguishes between implicit and explicit entities. Explicit entities are declared through structured markup; implicit entities are inferred from the text on the page.
Demystifying The Knowledge Graph, Barbara Starr, Search Engine Land (Sep 2)
The posting has advice for SEO practitioners on optimizing their pages for recognition by the Knowledge Graph.
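As a sketch of what “explicit” entity markup looks like in practice, here is a schema.org JSON-LD block of the kind a page might embed, parsed with Python’s standard library. The entity and its properties are invented for this example:

```python
import json

# A schema.org JSON-LD block, as it might appear in a page's <script> tag.
# The person and properties here are invented for illustration.
markup = """
{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Ada Lovelace",
  "jobTitle": "Mathematician"
}
"""

entity = json.loads(markup)
# An explicit entity: its type and name are declared, not inferred from prose.
print(entity["@type"], "-", entity["name"])
```

A crawler reading this markup gets the entity’s type and name directly, whereas an implicit entity would have to be inferred from the surrounding prose.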