Friday, February 17, 2006

Google Book Search--the Fair Use Debate

I was not the only member of the audience who gasped at Mary Sue Coleman's keynote address to the Association of American Publishers/Professional & Scholarly Publishing Division Annual Conference (AAP/PSP). Coleman, President of the University of Michigan, gave a spirited defense of Google Book Search. (You can see the text of her address here, courtesy of Battelle's Searchblog.)

She was articulate, passionate, and a staunch advocate of the University of Michigan's position of greatness in society, as you might expect. She was even persuasive about the business advantages to publishers of participating with Google in this "legal, ethical, and noble endeavor" that will save our society from the ruin of failing to preserve our intellectual history.

What the text of the speech doesn't convey is the dialog following her talk. I'm particularly hung up on what I thought of as her naive understanding of fair use: President Coleman stated unequivocally that her lawyers (one of whom she pointed out in the front row) have assured her that any person can walk into any library and legally, under fair use, make a copy of any book in the library for any purpose. So, by extension, Google can make copies of entire research libraries under fair use.

Shocked and appalled, I went to the web to see where this might be coming from. And I found no less a luminary than Larry Lessig, Stanford University's digital copyright guru and open access advocate, with a 30-minute presentation explaining why Google Book Search is protected under the fair use provisions of our copyright laws.

Larry's presentations are always interesting (do you remember his keynote at the SSP meeting in San Francisco a few years ago?), and this one is given in the same style, complete with understated slides: bald, white Courier type on a black background, emphasizing the key words of his arguments.

Now, I'm hesitant to take on Professor Lessig, with not even a law degree to my credit, but I feel compelled.

Lessig argues that the copying Google does to prepare the indexes and the short "snippets" of content that users will be able to see of copyrighted works is not wholesale copying but transformative--just like making thumbnail copies of images for a catalog. (He cites relevant case law to back up this assertion.) I have to wonder if you might also think of it as a derivative work, the creation of which is a right reserved to the copyright holder.

More compelling, he walks through the 4-prong copyright test that alleged copyright infringers must pass in order to be found to be operating under fair use. Here's my inexpert summary of the prongs and some examples:

1. The nature of the work (for example, works of fiction and high creativity tend to have more protection than compilations of facts)

2. The nature of the use of the copyrighted material (so, for example, if I copied material and sold it for profit, I would be less likely to pass this test than if I used it to save my dying mother or to educate my students or to incorporate it into a work of satire or literary criticism)

3. The amount and substantiality of the use (so it might be OK for me to copy 400 words of a 400-page book, but it might not be OK to copy an entire haiku that is only 10 words)

4. The effect the copying will have on the market for the work (so it might be OK for me to make a copy of an article from a professional journal or trade magazine for a group of students if it wasn't a resource they would typically buy, but it might not be so good for me to make a copy of a chapter of a textbook and put it on a university e-reserve server or course web site so the students didn't have to buy the book).

The maddening thing about the fair use prongs is you have to pass all of the tests. Equally difficult is that only a court can determine if something is fair use or not. So, of course, people (OK, lawyers) write all sorts of guidelines to keep honest folk out of trouble, which makes everybody convinced they know whether or not a particular circumstance constitutes fair use. (Because fair use is a defense rather than a right, Larry has argued elsewhere--convincingly--that the threat of legal action is curtailing the boundaries of fair use and stifling creativity.)

Let's take a look at the prongs with regard to Google Book Search.

1. The nature of the work: The unfathomable breadth of the copying Google must do to digitize the whole collections of the participating libraries means that works of every nature will be copied--Test failed.

2. The nature of the use: Google's defenders say that the preservation of works that are crumbling to dust will save scholarship; that the ability to search inside once-obscure titles will revive interest in those titles; and that enhanced discoverability will promote book sales. Google's critics counter that Google is a commercial organization that will monetize, or has the potential to monetize, works created by other parties without compensating them or getting their permission.

Rather than just jump on the bandwagon in a knee-jerk publisher kind of way, I checked out Google Book Search myself. My search for "Lessig" showed a first result for his book Free Culture. (See my review here.) The publisher, Penguin, allows Google to display the table of contents. In order to view the TOC, however, I had to sign in using my Google account. Don't have a Google account? Sign up here. Now Google has my name, which they can use to monetize their relationship with me by letting me know about their other products, which expose me to their advertisements. Why? Because I wanted to use Google to access another publisher's content.

Frustrated (and not knowing that the full text is available for free elsewhere on the web), I might choose to follow a link to Amazon, Barnes & Noble, or Penguin to purchase the book. Aha! the defenders chortle: you see, this is GOOD for publishers: you wouldn't have bought the book if it hadn't been for this search.

Perhaps. But Google has an opportunity (a tempting one) to forge affiliate or other relationships with the booksellers to get a piece of the sale, again, monetizing their relationship with me and their unpermissioned copying of the entire book, which allows them to offer the full text search feature.

We might believe President Coleman when she assures us that the University of Michigan will keep copyrighted works in the dark archive (inaccessible until they are in the public domain or otherwise can be used legally), but without a license agreement or permissions, how do publishers know that big commercial Google won't decide to use this data for some other purpose that might not be transparent to the public or the copyright holders? If the University of Michigan itself had created a massive digitization project and outsourced it to Google but maintained control over the files, then perhaps this test would pass. But, no.

Test failed?

3. The amount and substantiality of the work copied. Defenders argue that Google will only display a "snippet" of the work to users, so test passed.

Not so fast, though. In order to provide the service at all, Google had to make a complete copy of the entire work, which it is storing and might use in any way it sees fit. This is a dangerous precedent to set for other potential copiers of the work.

Larry also emphasized a number of times that less than 10 percent of the material being digitized by Google is both in print and in copyright. If he was asserting that this small proportion means the amount and substantiality test is passed, I must disagree. The test was designed to apply to individual works, not to the aggregation.

Test failed?

4. The effect of the use on the market: As we have seen, Google and its supporters argue that publishers will sell more books and make more money by being discoverable in the gigantic index.

Publishers counter that that might be true, but they should have a choice of marketing and distribution strategies. Google Book Search could work for publishers, or it could compete with their other plans. For example, scholarly publishers, now that they have their journals online and available, are building online ebook archives and the hosting systems to enable search, discovery, and use. Google would probably augment that. But shouldn't it be up to them to decide how to expose their content? Suppose a publisher wanted to transfer copyright or sign a license deal with Yahoo or another partner to digitize its content, but the partner insisted on an exclusive license. The existence of the publisher's content in Google Book Search could interfere with that use.

Larry's most elaborate argument, and one that (it seems to me) calls for the most activist judicial interpretation of the test, is that getting permissions from thousands of copyright holders is virtually impossible, the mark of a market failure of massive proportions. (Such permissions are particularly difficult because so much of the work in the libraries being digitized is in copyright but out of print, and many of the copyright holders are difficult if not impossible to reach.) So, since the market has failed to enable scalable copyright permissioning, it is fair use for Google to disregard the requirement to get copyright holders' permission. And, Larry adds, Google has provided a simple way to opt out if publishers don't want to participate.

Publishers don't think they should have to opt out in a system that requires their permission.

He has definitely identified an important public policy issue. What about works that are out of print but in copyright? What about the public good? If this problem needs to be solved, perhaps legislation is in order. Perhaps the Copyright Clearance Center has a role to play. After all, it is their business to enable an efficient market in copyrights.

Because the permissions problem is a difficult one, does that mean we waive the copyright laws?

Test failed?

I for one will be very interested in the court's interpretation of the AAP and publishers' case against Google.

But isn't it a shame that Google handled this project in such a high-handed way so that the publishers felt they had no choice but to sue in order to protect their rights? If indeed this project benefits EVERYBODY: society, researchers, authors, publishers, and yes, Google, wouldn't it have been easy to persuade publishers to agree to the project rather than saying: we're doing it anyway, come get us?

1 comment:

Anonymous said...

It should be noted that the four "tests" or "prongs" to the fair use analysis that you discuss are actually NOT mandatory, as you assert. These are all factors, each of which should be considered by the court, but no single one of them MUST be satisfied. In addition, in weighing and balancing these four factors, courts traditionally afford the most weight to the fourth factor, that of market effect. Since it appears that you believe all four factors weigh against Google in any event, this correction likely wouldn't affect your ultimate conclusion, but it should probably be noted nonetheless.