University of Pittsburgh
March 7, 2013

Universal digital library closer

Google’s attempt to digitize every book in existence has spurred two lawsuits that could lead to the universal library that humankind has envisioned since ancient Alexandria, or result in the destruction of the 17 million electronic volumes Google already has created, according to Pamela Samuelson.

“A universal digital library is within reach … We owe it to future generations to try,” was the message the former Pitt law faculty member delivered in her Feb. 19 Sara Fine Institute lecture here. Samuelson, a copyright expert and the Richard M. Sherman Distinguished Professor of Law and Information at the University of California-Berkeley, made the case that Google’s approach “got us closer than ever before to being able to achieve that.”

She added: “It was going to provide more access to information” than any other effort previously attempted. Yet it was “not the appropriate way to bring about that particular aim.”

The first lawsuit, Authors Guild et al. v. Google, Inc., had its origins in 2004, when Google began scanning the full contents of books in major research-library collections, beginning with the University of Michigan’s eight million volumes and the University of California’s three million. Many other state universities signed on to digitize their collections as well, knowing from previous court rulings that their endowments would not be in danger, even if copyright lawsuits ensued.

In fact, said Samuelson, it seemed like the perfect marriage: Google had the vision, technology and resources to digitize every book, but it had no cache of books, while the major research universities had massive library collections but lacked the resources to digitize everything.

After scanning the volumes, Google planned to index their contents to make them searchable through its Google Book Search. The company also planned to make available online, and via download, the full content of books whose copyright had expired. Google contracted with publishers or authors of more current works to make smaller portions of those books readable online as well. Most of the works Google planned to scan were out of print, and would gain new life, albeit in snippets, thanks to Google’s efforts.

To Google, scanning whole works constituted “fair use,” a legal doctrine that allows copying of works for a “transformative” use, such as reviews or parodies, without permission from the owner of the copyright. The doctrine usually does not cover wholesale copying or other uses that harm the market for a work. “They had a big bet on fair use,” Samuelson said.

Pamela Samuelson

For instance, Google asserted, its efforts would be particularly valuable for “orphan works,” books for which the copyright holder could not be found; those books would gain new commercial life as electronic copies. Google Book Search also would provide new audiences for authors by linking searchers to places where in-copyright works could be bought or borrowed legally.

“Google was well aware there were copyright issues there,” said Samuelson. “They wanted to have a corpus of everything — but that’s kind of Google.”

The Authors Guild, representing 8,500 authors, and five commercial or “trade” publishers sued Google in 2005, claiming copyright-law violations and citing Google’s commercial interests in scanning books. They questioned Google’s potential monopoly, “a concern that one private company would have essentially a chokehold on a corpus of what could be tens of millions of books,” Samuelson said. Google would have been able to generate revenue from its Book Search in several ways: from searchers ordering out-of-print book copies, for instance, and from the ads accompanying each search.

Nearly three years after the first lawsuit, a settlement was announced, pending the case judge’s approval. It would have given Google the right to scan works still under copyright protection and to index their contents, selling copies of those books that are out of print (unless the copyright holder objected).

Google also would have been obligated to send a digital copy of each book to its library of origin; create a single public terminal in each library to give access to the digital collection; and make all scanned books available to participating research libraries via subscription. Google even was planning to offer licenses, for a fee, to public schools and other public institutions. Qualifying individuals with visual impairments would have benefited as well from a large number of out-of-print volumes being newly accessible online. The settlement also would have created a book rights registry to which Google would have contributed 63 percent of its subscription fees and book-selling revenues, allowing copyright holders (authors and publishers) to collect royalties through the registry.

“Thousands of people objected to the settlement,” Samuelson said. She filed an objection herself, arguing that the Authors Guild did not adequately represent an even larger class of aggrieved authors. University research libraries, after all, are not filled with Stephen King novels and other popular works but with the scholarly, mostly out-of-print nonfiction works of more than 100,000 academic authors.

The settlement was scuttled in March 2011. The judge in the Authors Guild case said that the settlement had gone too far, essentially changing copyright laws. For instance, it forced copyright holders to raise an objection to stop Google from digitizing their works, rather than requiring Google to seek permission from each copyright holder before scanning. The judge also cited Samuelson’s objection that the legal class of commercial authors was too small.

Today, the Authors Guild case still is stuck in the pre-trial stage, while “the argument is still as hot as it was back in 2005,” Samuelson reported. Google continues to argue that clearing copyright on the world’s 137 million books — Google’s own estimate — would be prohibitive.

Meanwhile, more than 10.6 million of the books Google already has scanned are available at the HathiTrust web site. HathiTrust was formed by 60 of the research libraries to hold their scanned collections. Searching HathiTrust by subject brings up everything from mere catalog listings to reports on the frequency of a subject inside a particular book or journal (either of which can aid research or prompt a book’s purchase) and full views of books no longer in copyright.

HathiTrust is being sued by the Authors Guild as well. The Guild lost its first round in court — the judge ruled that scanning to preserve each work for future generations is a transformative use — but briefs currently are being filed in the author group’s appeal to the Second Circuit Court.

“It’s quite possible that the Second Circuit Court of Appeals will decide the HathiTrust case before the Google case,” Samuelson said, which would have important implications for the Google decision.

“If HathiTrust wins, it means that every library, every nonprofit institution in the country that wants to digitize things” could do so, she said, and “this would be a huge win. It’s not the universal digital library, but it’s getting us closer.”

On the other hand, she said, “If the Authors Guild wins, it’s not impossible that they could impound and destroy a corpus of 17 million volumes.

“The precedents that these cases will set will affect a much wider range of organizations,” Samuelson added, from historical societies to video game archivists.

Still, she concluded, “There are many paths to victory” for the universal digital library.

Licenses issued through the nonprofit Creative Commons let individuals maintain their copyrights while allowing others to make noncommercial copies. Some Nordic countries allow a work to be copied under an “extended collective license,” which creates an agency through which copyright holders can collect compensation for public access. Thousands of open-access repositories already allow free retrieval of research and other writings in their collections. And the new UnglueIt service lets users vote on which out-of-print books the service should seek, and pay, to make available on an open-access basis.

Samuelson would like to see more universities making more items available via open access. Universities also can help authors terminate their copyrights or use the provision within the copyright law that permits authors to take back copyrights to out-of-print monographs, she said.

To aid in making books and journals more widely and freely available, Samuelson is in the process of forming what she tentatively calls the Authors Alliance to counter the Authors Guild. “Stand by for that.” Most authors, she said, “want to make things available on an open-access basis,” contrary to what the Authors Guild asserted in court.

Federal legislation is not one of those paths to victory, Samuelson said, thanks to a dysfunctional Congress. But she expects the federal copyright office to propose new rules soon, which may help the situation.

“The tragedy of this is, who started ebooks? It wasn’t the publishers. It was Amazon. … We have to find a way to get people over this ‘electronic madness’” or fear of technology, she said, quoting the title of a Sara Fine book.

Samuelson’s lecture was presented by the Sara Fine Institute in the School of Information Sciences (SIS) and the Innovation Practice Institute in the School of Law.

The Fine Institute focuses on studying the manner in which technology affects interpersonal communications, and is named after a former SIS professor. The Innovation Practice Institute fosters collaboration among the law school, practicing lawyers, innovators and entrepreneurs working toward regional economic development.

—Marty Levine
