Deep World: Google's Moon Shot

Jeffrey Toobin of the New Yorker writes about Google's Book Search. Here is an excerpt-

"Every weekday, a truck pulls up to the Cecil H. Green Library, on the campus of Stanford University, and collects at least a thousand books, which are taken to an undisclosed location and scanned, page by page, into an enormous database being created by Google. The company is also retrieving books from libraries at several other leading universities, including Harvard and Oxford, as well as the New York Public Library. At the University of Michigan, Google’s original partner in Google Book Search, tens of thousands of books are processed each week on the company’s custom-made scanning equipment.

Google intends to scan every book ever published, and to make the full texts searchable, in the same way that Web sites can be searched on the company’s engine at google.com. At the books site, which is up and running in a beta (or testing) version, at books.google.com, you can enter a word or phrase—say, Ahab and whale—and the search returns a list of works in which the terms appear, in this case nearly eight hundred titles, including numerous editions of Herman Melville’s novel. Clicking on “Moby-Dick, or The Whale” calls up Chapter 28, in which Ahab is introduced. You can scroll through the chapter, search for other terms that appear in the book, and compare it with other editions. Google won’t say how many books are in its database, but the site’s value as a research tool is apparent; on it you can find a history of Urdu newspapers, an 1892 edition of Jane Austen’s letters, several guides to writing haiku, and a Harvard alumni directory from 1919.

No one really knows how many books there are. The most volumes listed in any catalogue is thirty-two million, the number in WorldCat, a database of titles from more than twenty-five thousand libraries around the world. Google aims to scan at least that many. “We think that we can do it all inside of ten years,” Marissa Mayer, a vice-president at Google who is in charge of the books project, said recently, at the company’s headquarters, in Mountain View, California. “It’s mind-boggling to me, how close it is. I think of Google Books as our moon shot.”

Google’s is not the only book-scanning venture. Amazon has digitized hundreds of thousands of the books it sells, and allows users to search the texts; Carnegie Mellon is hosting a project called the Universal Library, which so far has scanned nearly a million and a half books; the Open Content Alliance, a consortium that includes Microsoft, Yahoo, and several major libraries, is also scanning thousands of books; and there are many smaller projects in various stages of development. Still, only Google has embarked on a project of a scale commensurate with its corporate philosophy: “to organize the world’s information and make it universally accessible and useful.”

In part because of that ambition, Google’s endeavor is encountering opposition. A federal court in New York is considering two challenges to the project, one brought by several writers and the Authors Guild, the other by a group of publishers, who are also, curiously, partners in Google Book Search. Both sets of plaintiffs claim that the library component of the project violates copyright law. Like most federal lawsuits, these cases appear likely to be settled before they go to trial, and the terms of any such deal will shape the future of digital books. Google, in an effort to put the lawsuits behind it, may agree to pay the plaintiffs more than a court would require; but, by doing so, the company would discourage potential competitors. To put it another way, being taken to court and charged with copyright infringement on a large scale might be the best thing that ever happens to Google’s foray into the printed word."

Google Search

Google Pack

Deep World

Wednesday, September 05, 2007

Google's Moon Shot

No comments:

Labels

One of my favorites

Blog Archive

links