Earlier this month, the U.S. District Court for the Southern District of New York, on remand from the Second Circuit, sided with Google in the copyright infringement proceedings that began in 2005 over the Google Books Library Project. Judge Chin, presiding over the case, agreed that Google Books provides “significant public benefit”, and accepted Google’s fair use defense for scanning more than 20 million books into an electronic database and making snippets of the text available through online searches.

Under U.S. copyright law, fair use may be raised as an affirmative defense to an allegation of infringement. Specifically, courts consider four factors under 17 U.S.C. § 107 to determine whether a particular use of a copyrighted work is a fair use:

  1. the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
  2. the nature of the copyrighted work;
  3. the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
  4. the effect of the use upon the potential market for or value of the copyrighted work.

Regarding (1) the purpose and character of the use, Judge Chin found Google’s use of the copyrighted works to be “highly transformative”, in that “Google digitizes books and transforms expressive text into a comprehensive word index that helps readers, scholars, researchers, and others find books.” Further, “Google Books is also transformative in the sense that it has transformed book text into data for purposes of substantive research, including data mining and text mining in new areas”. Importantly, Judge Chin found that “Google Books does not supersede or supplant books because it is not a tool to be used to read books.” Although “Google is a for-profit entity and Google Books is largely a commercial enterprise”, Judge Chin reasoned that “even assuming Google’s principal motivation is profit, the fact is that Google serves several important educational purposes.”

Regarding (2) the nature of the copyrighted work, the fact that “the vast majority of the books in Google Books are non-fiction” and are all “published and available to the public” weighed in favor of fair use.

Regarding (3) the amount and substantiality of the portion used in relation to the work as a whole, Judge Chin acknowledged that Google scans the entire text of each book, but noted that copying a whole work may still be fair use. In this case, the full-text scan is critical to the functioning of Google Books and, more significantly, Google limits the amount of text it displays in response to any search. On balance, given that entire books are scanned, the third factor was nonetheless found to weigh “slightly against a finding of fair use.”

Regarding (4) the effect of the use upon the potential market, Google did not sell its scans, nor did the scans replace the books. To the contrary, “a reasonable fact finder could only find that Google Books enhances the sales of books to the benefit of copyright holders” because Google Books allows individual titles to be discovered and noticed, “much like traditional in-store book displays”. Thus, Judge Chin concluded that the fourth factor weighs strongly in favor of fair use.

The opinion raises an interesting question: are text and data mining activities themselves carved out of copyright protection? While Judge Chin seems to acknowledge that data mining has value, that value does not seem to belong to the copyright holders, at least not in this case. One might imagine that had Google charged a subscription fee to “readers, scholars, researchers” at large, rather than relying on its ad-supported revenue model, the fair use finding might have been completely different.
