Early this week, a federal court in the US started hearing arguments in a copyright infringement case involving four major book publishers on one side and Internet Archive, a non-profit digital library, on the other. The book publishers (Hachette Book Group, John Wiley & Sons, Penguin Random House, and HarperCollins) claim that Internet Archive is engaging in “mass copyright infringement” by scanning hard copies of their books and distributing them as ebooks on its “Open Library” for anyone to read for free.
What is the bone of contention?
Open Library was started by Internet Archive in 2006 to make out-of-print books available to online readers by scanning and uploading them. Over the years, the library has expanded to include many commercially available books. However, it functioned as a traditional library and allowed registered users with library cards to borrow books based on the number of copies available under the Controlled Digital Lending program. This meant that only one user could access a copy of an ebook at a time while others had to wait in line for it to become available.
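The one-reader-per-copy rule described above can be pictured with a short Python sketch. It is only an illustration of the general idea, not Internet Archive's actual system: the class name `LendingTitle`, its methods, and the user names are all hypothetical.

```python
from collections import deque


class LendingTitle:
    """Hypothetical model of one title: loans never exceed copies owned."""

    def __init__(self, title, copies_owned):
        if copies_owned < 1:
            raise ValueError("a title needs at least one owned copy")
        self.title = title
        self.copies_owned = copies_owned
        self.on_loan = set()     # users currently holding a copy
        self.waitlist = deque()  # users waiting, first come first served

    def borrow(self, user):
        """Lend a copy if one is free, otherwise queue the user.

        Returns True if the user holds a copy after the call.
        """
        if user in self.on_loan:
            return True
        if len(self.on_loan) < self.copies_owned:
            self.on_loan.add(user)
            return True
        if user not in self.waitlist:
            self.waitlist.append(user)
        return False

    def give_back(self, user):
        """Return a copy and pass it to the next waiting user.

        Returns the user who received the copy, or None.
        """
        if user not in self.on_loan:
            return None
        self.on_loan.remove(user)
        if self.waitlist:
            next_user = self.waitlist.popleft()
            self.on_loan.add(next_user)
            return next_user
        return None


book = LendingTitle("Example Title", copies_owned=1)
print(book.borrow("alice"))     # True: the single copy is free
print(book.borrow("bob"))       # False: bob joins the waitlist
print(book.give_back("alice"))  # bob: the copy passes to the next in line
```

Lifting the restrictions, as the National Emergency Library did, would correspond to dropping the `copies_owned` check so that every request is granted at once.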
These restrictions gave Open Library the safe harbour protection meant for public libraries under US copyright law. Internet Archive decided to lift these restrictions in March 2020 to support displaced learners after the Covid-19 outbreak, which led to the shutting down of schools and colleges across the country. It announced a National Emergency Library, which gave free and unlimited access to over 1.3 million books on the Open Library.
The book publishers argue that what Internet Archive is doing amounts to piracy, as it is producing copies of millions of unaltered in-copyright works, including books that were published in the last few years. The publishers said that they have agreements with several public libraries, which buy their print books and either license ebooks or agree to terms of sale for ebooks from publishers, through book wholesalers or ebook aggregators.
How is Internet Archive defending Open Library?
Though Internet Archive shut down the National Emergency Library in June 2020 in light of the lawsuit, the publishers are still pursuing the case. They believe that even the Controlled Digital Lending program violates copyright law, as Internet Archive is not licensing the ebooks directly from the publishers or ebook aggregators. Instead, it is scanning and converting print copies of books, many of which are recently published and commercially available as ebooks. The publishers claim that this hurts them and their authors, who get no compensation from Internet Archive for the “reproduction and distribution” of their books in digital form.
In its defense, Internet Archive has denied most of the allegations and argued that digitising and lending books online makes them accessible to marginalised communities.
It claims that books published before 1924 can be downloaded without any restriction, while readers can borrow contemporary ebooks through the Controlled Digital Lending program. The non-profit body further said that it has been digitising lawfully purchased or donated books in partnership with hundreds of other libraries since 2011.
Several Internet advocacy groups, including Fight for the Future, have come out in support of Internet Archive and digital lending. The Electronic Frontier Foundation (EFF), which is also defending Internet Archive in the case against the publishers, said that libraries have paid billions of dollars to publishers for the printed copies of the books and are trying to preserve them by digitising them.
What does the case mean for the non-profit’s other platforms?
Founded in 1996, Internet Archive provides several online services in the public interest. These include the Wayback Machine, Software Archive, Open Content Alliance, and Scanning Services, which are used by students, researchers, governments, and even enterprises.
The non-profit employs 150 people and has generated over $150 million in revenue in the last 10 years, a major share of which comes from donations, according to the documents filed in the lawsuit.
Its most popular platform, the Wayback Machine, is a digital archive of billions of published web pages and also serves as a large-scale data source for researchers. Many web pages that were blocked by ISPs or are no longer available on the source website can still be accessed on the Wayback Machine. Similarly, Open Content Alliance was started by tech firms, nonprofits, and government agencies to build a permanent archive of multilingual digitised text and multimedia material.
Internet Archive claims that the vast collection of digitised books available through Open Library has been used by data scientists to do computational analysis of texts to gain insights into the human mind. Open Library offers a suite of application programming interfaces (APIs) to help developers make use of its data. These include RESTful APIs, which make its data available in JSON, YAML, and RDF/XML formats.
For instance, the National Library of Australia uses the APIs to supplement its search results. When public domain books turn up in a search, it displays links to Open Library.
Internet Archive has also converted 130,000 book references in Wikipedia into live links to 50,000 digitised books on Open Library, across Wikipedia pages in multiple languages.
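To give a sense of what using the RESTful APIs mentioned above might look like for a developer, here is a minimal Python sketch that requests search results as JSON. The endpoint `https://openlibrary.org/search.json`, its `q` and `limit` parameters, and the `docs`, `title`, and `first_publish_year` fields are assumptions based on Open Library's public documentation and should be checked against it; the helper names and the sample payload are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed search endpoint; verify against Open Library's API documentation.
SEARCH_ENDPOINT = "https://openlibrary.org/search.json"


def build_search_url(query, limit=5):
    """Build the search URL for a free-text query."""
    return SEARCH_ENDPOINT + "?" + urlencode({"q": query, "limit": limit})


def summarise(payload):
    """Reduce a decoded search response to (title, first publish year) pairs."""
    return [
        (doc.get("title"), doc.get("first_publish_year"))
        for doc in payload.get("docs", [])
    ]


def search(query, limit=5):
    """Fetch and summarise live results (needs network access)."""
    with urlopen(build_search_url(query, limit), timeout=10) as response:
        return summarise(json.load(response))


# Hand-written sample shaped like the assumed response, so the parsing
# step can be tried without a network call.
sample = {"docs": [{"title": "Moby Dick", "first_publish_year": 1851}]}
print(build_search_url("moby dick", limit=2))
print(summarise(sample))
```

Calling `search("moby dick")` would perform the live request. It is left out of the runnable part so that the sketch works offline.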
Clearly, the lawsuit will not only impact readers but also individuals and firms relying on open-source projects like Open Library for reference, information, and research.