The Internet Archive hopes to create another copy of the archive to be stored in Canada – because redundancy will protect against loss. Good idea. They need money to do this. Donations are tax deductible, but I presume that applies to residents of the United States – not Canada. Certainly it’s in our interest, since a great deal of Canadian material from websites and digitization projects is stored in the archive. For example, view this page listing Canadian Libraries and the number of items digitized.
Help Us Keep the Archive Free, Accessible, and Reader Private, Brewster Kahle, Internet Archive Blog (Nov 29)
The comments are interesting, although not consistently supportive or elevating. Many headlines attribute this decision to a need for protection against Trump or the Trump administration. CNET said it straight out: Trump inspires Internet Archive to build replica in Canada
Several Canadians responded – attracted to the posting by Canada in the title. They seem keen to donate but note the absence of charitable status. A couple of American writers regard Canada with suspicion because Canada restricts freedom of speech through its laws against hate speech and because Prime Minister Trudeau spoke favourably about Fidel Castro.
One interesting point to note is that the owner of a domain can “use their robots.txt file to remove current AND past archives” from the WayBack Machine.
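As a rough illustration of that robots.txt point, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The robots.txt content, the `example.com` URL and the helper name `is_blocked` are all hypothetical, and `ia_archiver` is assumed to be the user agent the Wayback Machine looked for at the time; check the Internet Archive's own documentation before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site. "ia_archiver" is assumed
# here to be the user agent the Wayback Machine honoured.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Allow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# The archive crawler is shut out, while other crawlers are still allowed.
archive_blocked = is_blocked(ROBOTS_TXT, "ia_archiver", "https://example.com/page.html")
others_blocked = is_blocked(ROBOTS_TXT, "SomeOtherBot", "https://example.com/page.html")
print(archive_blocked, others_blocked)
```

The sketch only shows how such a rule reads to a crawler; whether past captures are hidden as well is a policy decision on the archive's side, as the quoted comment describes.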
All in all, the Internet Archive is an extremely valuable resource to Canadians, especially for historical research – we do need to help keep it safe from whatever disaster could befall it. It’s just good disaster planning.
The Digital Public Library of America becomes an even richer resource for researchers thanks to the newly inked agreement by the Library of Congress to be a “content hub partner”. This could make DPLA everyone’s first stop for finding materials on US cultural history.
Library of Congress, Digital Public Library of America To Form New Collaboration, Dick Eastman, Eastman’s Online Genealogy Newsletter (Nov 29)
Quoted – “The Digital Public Library of America is a portal — effectively, a searchable catalog—that aggregates existing digitized content from major sources such as libraries, archives, museums and cultural institutions. It provides users with links back to the original content-provider site where the material can be viewed, read or, in some cases, downloaded.”
Nearly half the world will be online by year’s end; poorer countries will lag: report, Nqobile Dludla, Reuters via Globe and Mail (Nov 22)
Only half? And most of these are in developed countries. The adoption – or distribution – of the Internet and its capabilities is slowing.
In the world’s developed countries about 80 per cent of the population use the internet. But only about 40 per cent in developing countries and less than 15 per cent in less-developed countries are online, according to a report by the U.N.’s International Telecommunications Union (ITU). …
Globally, 47 per cent of the world’s population is online, still far short of a U.N. target of 60 per cent by 2020. Some 3.9 billion people, more than half the world’s population, are not. ITU expects 3.5 billion people to have access by the end of this year.
I’ve just had to deal with a bout of Cryptowall ransomware – it’s not pretty. So any headline with the word ransomware catches my attention. This was in the Nov 27 ResearchBuzz – many thanks.
Ransomware is spreading via images on Web sites. “‘Locky’ ransomware was first discovered earlier this year. As the name implies, it locks up a victim’s computer by encrypting their files and demanding a ransom of .5 bitcoins (about $365) in exchange for a key. Earlier this week, Hacker News reported that a Facebook spam campaign was spreading Locky through image files in the SVG format. At the time, Facebook denied that this was happening. Now, security firm Check Point says that Locky is being embedded into several graphic formats and spread through ‘social media applications such as Facebook and LinkedIn.’”
Google Scholar has competition in two AI-based scholarly search engines: the new Semantic Scholar, strong in the sciences, and the relaunched Microsoft Academic, with content from many fields of study.
AI science search engines expand their reach “Semantic Scholar triples in size and Microsoft Academic’s relaunch impresses researchers”, by Nicola Jones, Nature (Nov 11)
[Semantic Scholar] A free AI-based scholarly search engine that aims to outdo Google Scholar is expanding its corpus of papers to cover some 10 million research articles in computer science and neuroscience, its creators announced on 11 November. Since its launch last year, it has been joined by several other AI-based academic search engines, most notably a relaunched effort from computing giant Microsoft.
The invaluable Internet Archive has added two search features: faceted filtering – media type, topics and subjects; and full text searching across 9 million text items (but in beta).
Searching Through Everything, Internet Archive blog post (Oct 26)
For years, Google would not reveal how many pages it reached. Then in 2013 Google revealed a figure of 30 trillion; today it’s 130 trillion pages. This is reach – i.e., pages Google knows about – and they aren’t necessarily all indexed in the database.
Google’s search knows about over 130 trillion pages Barry Schwartz, Search Engine Land (Nov 14)
Students and researchers can learn more about scholarly databases through Beyond Citation. Started as a project in the CUNY Graduate Center Digital Praxis Seminar in 2014, its objective is to “aggregate information about academic databases to encourage critical thinking about how these resources affect scholarship”. In other words, it’s important to be aware of the limitations of Google Books, and of the other twelve databases reviewed. At the very least this is a good starting point for learning of the existence and qualities of these databases. But updates and additions to the site seem to have stopped in August 2015. The Twitter feed for @beyondcitation, however, is alive and active.
Dan Russell at SearchResearch had a series of challenges on finding emoji and Unicode characters – what they are, how to do it, and why you would want to.
- The background – #900 – A note about how to search for emojis and other Unicode characters
- The challenge – Finding interesting uses for unicode/emoji search?
- The answer – Answer: Finding interesting uses for emoji/Unicode search?
Genealogy researchers need to use advanced search techniques. Blogging gurus in this field sometimes offer guidance in using search operators. Dick Eastman has posted (+) Boolean Basics – Part #2, in which he shows the use of the NOT operator (the minus sign, -) to exclude terms, and quotation marks to look for phrases. He illustrates with several well-developed queries showing OR and AND (although Google defaults to ANDing terms).
Please note: the + operator for requiring a term no longer works. You can accomplish the same thing by putting the word inside quotation marks, or use Verbatim, found under Search Tools > All Results.
As well, there is the number range operator, which is extremely useful when looking for events between dates. The format is nn..nn. Example: immigration “canada west” 1843..1853 – to get pages that talk about immigration to Canada West and have a date within the range of 1843 to 1853. Since Upper Canada was also still in use as a name at the time, the query could be formed as immigration (“canada west” OR “upper canada”) 1843..1853
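For anyone who builds these queries often, the operators above can be put together with a small helper. This is only a sketch: the function name `build_query` and its parameters are my own invention, not part of any Google tool, and it emits plain straight quotation marks since those are what you type in the search box.

```python
def build_query(keywords, any_phrases=(), exclude=(), year_range=None):
    """Assemble a search query string from the operators described above."""
    parts = list(keywords)

    # Quotation marks ask for an exact phrase.
    quoted = [f'"{phrase}"' for phrase in any_phrases]
    if len(quoted) == 1:
        parts.append(quoted[0])
    elif quoted:
        # Parentheses group the alternatives joined by OR.
        parts.append("(" + " OR ".join(quoted) + ")")

    # A leading minus sign excludes a term.
    parts.extend(f"-{term}" for term in exclude)

    # Two dots between the numbers form the nn..nn range operator.
    if year_range is not None:
        start, end = year_range
        parts.append(f"{start}..{end}")

    return " ".join(parts)


query = build_query(
    ["immigration"],
    any_phrases=["canada west", "upper canada"],
    year_range=(1843, 1853),
)
print(query)
```

Paste the printed string into the search box as is. The `exclude` parameter covers the minus-sign case, for example `build_query(["smith"], exclude=["blacksmith"])`.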