A tremendous article in the New Yorker by Jill Lepore looks at efforts to archive and preserve Web content – The Cobweb: Can the Internet be archived?
First point – “The average life of a Web page is about a hundred days.” Pages disappear for many reasons: sites die with their hosts, MySpace being an example; an organization purposely deletes pages, as the British Conservative Party did with 10 years of speeches; or a website is reconfigured and content isn’t moved or becomes impossible to find. The result is a plague of link rot in footnotes.
The Internet Archive (archive.org) is the largest program to save Web content. It has captured 425 billion pages. Associated services help it – Archive-It, and “Save Page Now” at archive.org. There is also a new service, Perma.cc, for creating permalinks for articles referenced in footnotes.
There are other initiatives – Europeana, a digital library in Europe, and the Digital Public Library of America.
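For anyone wanting to check whether a footnote link already has an archived copy, here is a minimal Python sketch. It assumes the Wayback Machine's public availability endpoint at archive.org/wayback/available and its JSON response shape. The function names and the sample response are illustrative, hand-written for this example, and the code makes no network call itself.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint of the Wayback Machine availability lookup.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url, timestamp=None):
    """Build the lookup URL for a page, optionally near a YYYYMMDD timestamp."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the archived URL from a JSON response, or None if nothing is archived."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


# Hand-written sample in the assumed response shape, not live data.
SAMPLE = (
    '{"url": "example.com", "archived_snapshots": {"closest": {'
    '"status": "200", "available": true, '
    '"url": "http://web.archive.org/web/20130919044612/http://example.com/", '
    '"timestamp": "20130919044612"}}}'
)

if __name__ == "__main__":
    print(availability_query("example.com", "20150101"))
    print(closest_snapshot(SAMPLE))
```

To use it for real, fetch the URL returned by availability_query with any HTTP client and pass the response body to closest_snapshot. A result of None suggests the page has not been captured yet.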
Biggest issue is copyright – and the right to save.
Internet Archive blogged about this article also – The New Yorker: The Cobweb–Can the Internet be archived?. We hope they archived it.
Article does not mention the work Internet Archive has done to save other media and digital books.
Which of these do you prefer – Wikipedia or Encyclopedia Britannica? I use both, though it depends on how much Britannica will reveal. This article asks — Which Has More Bias? Wikipedia or the Encyclopædia Britannica — by Feng Zhu, Working Knowledge at the Harvard Business School. (Jan 19)
It’s not easily untangled. They both show bias, although articles in Wikipedia, if edited by enough people, become more balanced or unbiased. Conclusion seems to be – be aware of bias, and use both.
University of Toronto Libraries has been archiving web content in four areas through its relationship with the Internet Archive and its page-capturing service, Archive-It.
These are available at Archive-It.
- Canadian Government Information
- Canadian Labour Unions
- Canadian Political Parties and Political Interest Groups
- University of Toronto Web Archives
The collections are searchable and one can refine by format.
To see the list of sites included, enter the collection and click on the collection name. There are excellent filters for narrowing the search: subject, creator, year, language.
University of Toronto – Archive – Canadian Government Information
Archive material from the Illustrated London News and the “Great Eight illustrated magazines”, 1914 to 1919, will be available at the Illustrated First World War site.
Browse the wartime pages of the Illustrated London News in a new online archive, First World War Centenary (Aug 13)
The project means that for the first time in 100 years, the public will be able to browse the wartime pages of The Illustrated London News and its sister titles; discover paintings, illustrations and sketches by war artists; and read articles, many of which have not been seen since they were first published.
- timeline to the war
- ILN articles – that seem to be timed to the current week 100 years ago
- War artists who were illustrators.
- A blog
Here’s another resource of war records for the researcher into World War One – the names and information about people held in the prisoner-of-war camps.
New, Free Website Has Millions of World War I Prisoner of War Records, Genealogy Insider (Aug )
Records were collected by and are made available through the International Committee of the Red Cross — http://grandeguerre.icrc.org/
Records include the ledger entries for prisoners, some postcards or pictures of camps, and a few personal accounts.
Must have been a mammoth job to digitize it all – see the video at http://grandeguerre.icrc.org/en/MakingOf
The Internet Archive is best known for the Wayback Machine and its archived web pages, but it has much more – books, images, music, and specialized collections.
5 Types Of Free Content Riches You Can Dig Up At The Internet Archive by Jessica Coccimiglio, Make Use Of (Jul 16)
Canadians will be interested in the long list of texts and collections from Canadian schools and associations — Canadian Libraries
Those seeking grey literature or scholarly works will want to explore the Digital Commons Network of free, full-text scholarly works. These are sourced from 330 universities and colleges worldwide (although most are in the United States) and curated by the university librarians. Among Canadian universities I noted McMaster University, Wilfrid Laurier, University of Windsor, University of Western Ontario, and Osgoode Hall Law School of York University. There are surely more.
Digital Commons holds “peer-reviewed journal articles, book chapters, dissertations, working papers, conference proceedings, and other original scholarly work.”
The site opens to a multicoloured wheel for visually exploring the disciplines. Continue browsing by spinning the wheel. You may also click on an academic discipline and begin to narrow by journal, author, or keyword search.
Search wheel for exploring the Digital Commons Network
The collection is made available through Berkeley Electronic Press (bepress.com).
This stupendous resource was featured in the Digital Shift – Uncommonly Open: The New Digital Commons Network (June 19, 2013)
This resource was reviewed in the BestBizWeb newsletter.
Important facts about the Wayback Machine from the Internet Archive.
Wayback Machine Adds 160 Billion Indexed Pages In A Year, Surpasses 400 Billion Indexed Pages, Barry Schwartz, Search Engine Land (May 12)
- It has indexed over 400 billion pages since 1996
- It added 160 billion pages in about 14 months (Jan 2013 to May 2014)
- In October 2013 it added capability to quickly view new content.
- Individuals can also save specific pages to the archive. See How To Save URLs To The Wayback Machine On Demand, Gary Price, Search Engine Land (May13)
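The save-on-demand step in the last point can be sketched in a few lines of Python. This assumes the commonly used pattern of prefixing a page address with web.archive.org/save/ to request a capture. The function name is hypothetical, and the sketch only builds the address without contacting the archive.

```python
# Assumed prefix for requesting a capture through "Save Page Now".
SAVE_PREFIX = "https://web.archive.org/save/"


def save_page_now_url(page_url):
    """Return the address to visit to ask the Wayback Machine to capture page_url."""
    # Add a scheme when the caller passes a bare domain such as "example.com".
    if not page_url.startswith(("http://", "https://")):
        page_url = "http://" + page_url
    return SAVE_PREFIX + page_url


if __name__ == "__main__":
    print(save_page_now_url("example.com/speeches"))
```

Opening the resulting address in a browser should trigger the capture, although the archive may decline pages whose owners block crawlers.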
The US Court of Appeals (2nd Circuit) has ruled that “full-text book scanning is generally going to be considered ‘fair use’ and protected from claims of copyright infringement”.
Book Scanning Suits Against Google, Others Wind Down With Fair Use Rulings, Greg Sterling, Marketing Land (Jun 11)
This ruling allows 90+ member libraries of the HathiTrust Digital Library to continue their work to digitize their collections for access.
The Baylor University Library webpage provides some background on HathiTrust:
HathiTrust was established in 2008 with the mission to “contribute to the common good by collecting, organizing, preserving, communicating and sharing the record of human knowledge.” The original HathiTrust libraries were partners with Google and/or the Internet Archive for the digitization of books in their collections. In part, HathiTrust was created so these libraries could work collaboratively to manage, provide access to, and preserve their digital assets in ways that Google could not.
DMR – Digital Marketing Ramblings – is loaded with statistics and infographics about nearly every aspect of the Internet – social media, internet usage, browsers, Google, Microsoft, Apple. Get insider tips and learn about gadgets. There is lots here to entertain and inform people interested in digital marketing, trends and technology.