[Marxism] Go Blue ! go red : University of Michigan's entire library to be put on Google

Charles Brown cbrown at michiganlegal.org
Wed Dec 15 07:52:30 MST 2004

MIKE WENDLAND: U-M's entire library to be put on Google


Billion-dollar project will move text of 7 million volumes online


December 14, 2004



Google, the ubiquitous Internet search engine, is taking the University of

Michigan's library from Ann Arbor to the world.


7,000,000: Volumes in the U-M library to be digitized.

2,380,000,000: Estimated number of pages.

743,750,000,000: Estimated number of words.

1,600: Years it would take U-M to digitize all 7 million volumes without

Google's special technology.

Fewer than 7: Years it will take to digitize the volumes with Google's

special technology.

$1 billion: Estimated value of the project to U-M.

Source: John Wilkin, associate university librarian, library information

technology and technical and access services, University of Michigan


U-M and the California-based information company will announce an agreement

today under which the complete text of all 7 million volumes in U-M's

library will be digitized -- that is, turned into a computer-readable format

-- and made instantly searchable by anyone using Google.

The massive project means that within a few years, people doing research

about practically anything -- whether for a scholarly paper, a high school

project or a family tree -- will be able to consult U-M's collections online

almost as easily as they could if they were sitting in the landmark library

building on the university's central campus.

It is the largest such digital scanning project ever undertaken, and one

that promises to take online searching far beyond the traditional Web pages,

news and shopping sites that make up most searches today.

"This project signals an era when the printed record of civilization is

accessible to every person in the world with Internet access," said U-M

President Mary Sue Coleman. "It is an initiative with tremendous impact

today and endless future possibilities."

Besides digitizing U-M's massive collection, Google plans to scan parts of

other research libraries, including those at Harvard, Stanford, Oxford

University in England and the New York Public Library. Those projects are

much smaller in scope than Google's plans for U-M. At Harvard, for example,

only 40,000 of the university's 15 million volumes will be digitized.

U-M's library, often ranked among the nation's top 10 research collections,

has been a leader in the drive to convert printed information into digital

form, which scholars say will preserve fragile items and make it easier for

researchers to find the information they want.

During the past several years, the university has scanned about 22,000

volumes, one of the most ambitious digital efforts among U.S. universities.

When Google offered technology that could handle the entire collection, U-M

jumped at the opportunity.

Google has a strong connection to Ann Arbor: Larry Page, one of the

company's two founders, is a graduate of U-M's engineering school. He was

the first recipient of the University of Michigan Alumni Society's recent

engineering graduate award.

The size of the U-M undertaking is staggering. It involves the use of new

technology developed by Google that greatly speeds the digitizing process.

Without that technology -- which Google won't discuss in detail -- the task

would be impossible, says John Wilkin, the U-M associate librarian who is

heading the project.

"Going as fast as we can with the traditional means of doing this, it would

take us about 1,600 years to do all 7 million volumes," he said. "Google

will do it in six years."

Under the agreement, the library will get a digital copy of every book

scanned. With those copies, the library can prepare special research

projects, virtual exhibitions and more relevant scholarly and academic

material for its students and faculty.

"If we were to do this job ourselves, it would probably cost us $600

million," Wilkin said. "That's just the human cost of preparing the material

for scanning, packing it up and sending it out to vendors and then

quality-control checking of the results. This is easily a billion-dollar

project."

Although a few sample volumes were to be made available online today to

highlight the project, significant amounts of material from the library

won't be online until mid-2005. All 7 million volumes should be digitized

into the Google database sometime shortly after 2010.

For Google, digitizing the collection is part of an effort called Google

Print (http://print.google.com), in which the

popular search site is working to create digital databases of books,

reports, manuscripts and other printed materials. The goal is for Web users

accessing the search site to be able to type in a phrase or key words and be

presented with direct access to in-depth research and literary material.

The prospect of expanding that effort to include U-M's 7 million items has

researchers buzzing.

"It's a noble effort, and a huge undertaking," said Gary Price, editor of

ResourceShelf (www.resourceshelf.com), a site geared toward information

professionals. "But it's so huge a project that the concern I have is that

people may be lost in a sea of possible links."

Price said he believes the project will lead to similar efforts by Microsoft

and Yahoo.

"Both of them have the money and the expertise to do this," Price said, "and

there are a lot more libraries around the country. They won't want Google to

have this kind of an advantage over them."

Google refuses to say how many people will be at U-M doing the digitizing

work. "All we can say is this is a very large project, and we will be

working on it aggressively," said Susan Wojcicki, Google's director of

program management.

What users will see when they search the U-M collection online depends upon

whether the information is still covered by copyright. For older items,

users will be able to search for and read every word on each page of a book

or document. But for material under copyright, the university will put a

short synopsis of the material online, with information that links to the

publisher or libraries where the work can be obtained.

Contact MIKE WENDLAND at 313-222-8861 or mwendland at freepress.com.

