Firefox private browsing

Firefox has added new privacy protection in the form of anti-tracking tools.

Firefox keeps your browsing truly private with new Tracking Protection feature, PCWorld (Nov 3)

The new feature is an enhancement to Firefox’s Private Browsing mode, which deleted users’ browsing history and cookies after they closed a private window. Tracking Protection adds an extra layer of privacy to that by blocking code embedded in websites that tracks the way people behave around the Web. That means it will block a lot of ads, along with analytics tools and some social sharing buttons in order to help users keep their browsing habits more closely under wraps.
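To make the blocking idea concrete, here is a minimal sketch of how a browser might check each outgoing request against a list of known tracker domains. It is not Firefox's actual implementation: the domains in BLOCKED_DOMAINS and the function names host_of and is_blocked are hypothetical, invented for illustration.

```python
from urllib.parse import urlparse

# Hypothetical blocklist entries; a real list would be far larger.
BLOCKED_DOMAINS = {"tracker.example", "ads.example"}


def host_of(url):
    """Return the lowercase hostname of a URL, or an empty string."""
    return (urlparse(url).hostname or "").lower()


def is_blocked(request_url, page_url, blocklist=BLOCKED_DOMAINS):
    """Return True if a third-party request goes to a listed tracker domain."""
    request_host = host_of(request_url)
    page_host = host_of(page_url)

    # Assumption of this sketch: requests to the page's own host are allowed.
    if request_host == page_host:
        return False

    # Match the host itself or any parent domain against the blocklist,
    # so cdn.tracker.example is caught by the entry tracker.example.
    parts = request_host.split(".")
    return any(".".join(parts[i:]) in blocklist for i in range(len(parts)))
```

With this sketch, a script loaded from cdn.tracker.example on a news page would be blocked, while the page's own scripts and images from unlisted hosts would load normally.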

Semantic Scholar

Semantic Scholar is a new search engine that uses machine learning to extract concepts from papers. For now its corpus is limited to computer science papers.

Academic Search Engine Grasps for Meaning, Will Knight, MIT Technology Review (Nov 2)

Etzioni says the goal for Semantic Scholar is to go further by giving computers a much deeper understanding of new scientific publications. His team is developing algorithms that will read graphs or charts in papers and try to extract the values presented therein. “We want ultimately to be able to take an experimental paper and say, ‘Okay, do I have to read this paper, or can the computer tell me that this paper showed that this particular drug was highly efficacious?’”

Reference Management Software

If you need to manage references for your research, you’ll get good use from this review of eight reference-management tools.

Eight ways to clean a digital library, Jeffrey M Perkel, Nature (Nov 2)

This article focuses on eight tools — colwiz, EndNote, F1000Workspace, Mendeley, Papers, ReadCube, RefME and Zotero — all competing in the reference-management market (see ‘Reference-management software’ or download this Excel spreadsheet for a fuller comparison of the software ). Some excel at streamlining the process of browsing and building literature libraries, whereas others focus on creating bibliographies, aiding collaboration through the use of shared workspaces or recommending papers.

Interview with Tara Calishain

Two search experts converse: Robert Berkman at Best of the Business Web interviewed Tara Calishain, search maven extraordinaire of ResearchBuzz. Tara comments on the state of web search, the need for a podcast search engine (yes please – or a directory), and her trusted sources.

A Conversation with Tara Calishain of ResearchBuzz (Nov 3)

This month we have chosen Tara Calishain, the creator and editor of ResearchBuzz, one of our favorite blogs for doing better online research. We chose ResearchBuzz as one of our September 2015 Best of the Business Web selections, describing it as “lively, fun and extremely informative.”

Google’s RankBrain and Entity Analysis

Kristine Schachinger at Search Engine Land tackles the relationship between RankBrain’s use of machine learning and the entity analysis Google developed for the Knowledge Graph.

How RankBrain Changes Entity Search (Oct 29)

The key section is below, but you will really want to read the entire article yourself.

So while Google can understand known entities and relationships via data definitions, distance and machine learning, it cannot yet understand natural (human) language. It also cannot easily interpret attribute association without additional clarification when those relationships in Google’s repository are weakly correlated or nonexistent. This clarification is often a result of additional user input.

Of course, Google can learn many of these definitions and relationships over time if enough people search for a set of terms. This is where machine learning (RankBrain) comes into the mix. Instead of the user refining query sets, the machine makes a best guess based on the user’s perceived intent.

Google ranks results with RankBrain AI

AI has arrived at Google, after years of corporate acquisitions and experimental work with machine learning. It’s called RankBrain and it does the following:

RankBrain uses artificial intelligence to embed vast amounts of written language into mathematical entities — called vectors — that the computer can understand. If RankBrain sees a word or phrase it isn’t familiar with, the machine can make a guess as to what words or phrases might have a similar meaning and filter the result accordingly, making it more effective at handling never-before-seen search queries.
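As a rough illustration of the word-vector idea, here is a minimal sketch of finding the known word whose vector is closest to that of an unfamiliar term, using cosine similarity. This is not RankBrain itself: the three-dimensional vectors and the names VECTORS, cosine and nearest are invented for illustration, and real systems learn vectors with hundreds of dimensions from large amounts of text.

```python
import math

# Toy vectors invented for this example; not real embeddings.
VECTORS = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.85, 0.15, 0.05],
    "banana": [0.0, 0.2, 0.95],
}


def cosine(a, b):
    """Cosine similarity: near 1.0 for similar directions, near 0.0 for unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(query_vector, vectors=VECTORS):
    """Return the known word whose vector is most similar to the query vector."""
    return max(vectors, key=lambda word: cosine(query_vector, vectors[word]))
```

In this toy setup, "car" and "automobile" point in nearly the same direction, so a never-before-seen term whose vector lands near them would be treated as having a similar meaning, while "banana" sits far away.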

RankBrain is already the third most important ranking signal, according to the Bloomberg report.

See Google Turning Its Lucrative Web Search Over to AI Machines by Jack Clark, Bloomberg (Oct 26) for a video and description.

Danny Sullivan provides background and details in FAQ: All About The New Google RankBrain Algorithm [Search Engine Land, Oct 28]. RankBrain is not a new search algorithm: it is one more component (albeit an important one) of the overall Hummingbird search algorithm introduced a couple of years ago.

Sullivan refers to the Bloomberg article and hazards a guess that the other two top ranking signals being used by Google are 1) links, still, in spite of the problems with these, and 2) words, i.e. matching on the search terms. Sullivan also mentions that Google has been expanding words for several years (word variants and related words) and that these expansions feed into selecting and ranking results. Google also employs entity analysis in providing answers through the Knowledge Graph.

Of interest: 15% of the 3 billion queries Google handles each day are new, and being new may lead to some adjustments to algorithms by the staff of search analysts.

Among those can be complex, multi-word queries, also called “long-tail” queries. RankBrain is designed to help better interpret those queries and effectively translate them, behind the scenes in a way, to find the best pages for the searcher.

For those wishing to know more about how RankBrain works with “word vectors”, Sullivan points to a couple of papers.

Greg Finn at Search Engine Land provides another synopsis: AI has officially made its way into Google’s search algorithm, here’s what you should know.

Can Bing be far behind in also employing AI for its search results?

New Lease for Wayback Machine

The Internet Archive has received funding to improve and expand the Wayback Machine for digital preservation of Web content. Thank the Laura and John Arnold Foundation for their foresight and concern. Preservation is vital, and everyone (governments, companies, and people) should be contributing to the Internet Archive for the common good.

Grant to Develop the Next Generation Wayback Machine, Wendy Hanamura, Internet Archive Blogs (Oct 21)

The Wayback Machine, a service used by millions to access 19 years of the Web’s history, is about to get an update. When completed in 2017, the next generation Wayback Machine will have more and better webpages that are easier to find. The Internet Archive, with generous support from the Laura and John Arnold Foundation (LJAF), is re-building the Wayback Machine which currently offers access to 439+ billion Web captures including Web pages, video and images.

DuckDuckGo – no ads

Not many people write about search engines anymore, but when someone does it is often to announce their conversion to DuckDuckGo. And every time I say: yes, I should make the switch too. This time “you can turn off the ads” caught my eye, since more and more with Google the first screen of results is made up of ads.

Why This Longtime Google Fan Now Prefers DuckDuckGo, Justin Pot, Make Use Of (Oct 18)

The more I use DuckDuckGo, the more I realize it’s like Twitter: it’s the search engine that gives me what I ask for, instead of tracking my behavior and giving me what it thinks I want. Some people might prefer the track-and-cater-to-whims approach, but I think the give-the-user-what-they-ask-for approach is better.