Google Book-Scanning Efforts Spark Debate from the Associated Press looks at
how the rivalry between Google’s library scanning project and that of the
Open Content Alliance —
backed by Yahoo and Microsoft — is getting more heated. Google pretty much
comes off as the evil company trying to lock up books for its own commercial
goals. I’ll try to restore some balance to that. But then again, perhaps the
rhetoric is the only thing that will make Google decide it should figure out a
way to better assure people that the scanning will be as open source as possible.
The OCA, and in particular Brewster Kahle of the Internet Archive that’s also
behind the project, seems to be ramping up the accusations that Google is running
a closed system that goes counter to Google’s "Don’t Be Evil" philosophy. From the article:
"They don’t want the books to appear in anyone else’s search engine but their
own, which is a little peculiar for a company that says its mission is to make
information universally accessible," Kahle said.
He said similar things last month, and far more strongly. From
a video Philipp Lenssen made at Google Blogoscoped:
Pretty much Google is trying to set themselves up as the only place to get to
these materials; the only library; the only access. The idea of having only one
company control the library of human knowledge is a nightmare. I mean this is
1984 – a book about how bad the world would be if this really came about, if a
few governments’ control and corporations’ control on information goes too far.
Wow. I’ve got great respect for Brewster, but I think making this out to be
some 1984 info control scenario is going too far. For its part, Google responds in the article:
None of Google’s contracts prevent participating libraries from making
separate scanning arrangements with other organizations, said company
spokeswoman Megan Lamb.
Aside from cutting separate scanning deals, I believe the agreements Google
has with libraries give them copies of what Google has scanned to do with as
they wish. So I think it’s a stretch
to say Google’s trying to keep everything for themselves. But Google still comes
across as the crass commercial one in all this:
The motives behind Google’s own book-scanning initiative aren’t entirely
altruistic. The company wants to stock its search engine with unique material
to give people more reasons to visit its website, the hub of an advertising
network that generated most of its $2 billion profit through the first nine
months of this year.
Despite its ongoing support for the Open Content Alliance, Microsoft
earlier this month launched a book-scanning project to compete with Google.
Like Google, Microsoft won’t allow its digital copies to be indexed by other search engines.
Microsoft gets a slight nod at being perhaps not so altruistic, but let’s be more
blunt. This month’s launch of Microsoft’s Live Search Books
(gad, what is with these terrible names!) was for all the same commercial
reasons Google has. There’s information in books. Providing access to
information has been proving a money maker.
Note the part about Microsoft not allowing its digital copies to be indexed.
I think this and a similar reference to Google are talking about preventing
spiders from crawling the respective book search sites, to automatically
download PDF files. That wouldn’t be useful anyway. You need the associated
index that’s making the *images* of these books searchable.
Having tossed out some bones of balance Google’s way, let me jump back in on
the side of greater cooperation and openness that the OCA is pitching. I
dearly wish Google would get together with them and other scanning projects and
come up with a real, open way to index this material. I’ve written before
about concerns we’re
going to have wasteful, duplicated efforts and some type of VHS/Betamax battle
of digital book formats.
Others have voiced concerns as well. The AP story also touches on this:
But some of the participating libraries may have second
thoughts if Google’s system isn’t set up to recognize some of their digital
copies, said Gregory Crane, a Tufts University professor who is currently
studying the difficulty accessing some digital content.
For instance, Tufts worries Google’s optical reader
won’t recognize some books written in classical Greek. If the same problem
were to crop up with a digital book in the Open Content Alliance, Crane thinks
it will be more easily addressed because the group is allowing outside access
to the material.
The battle shaping up over book scanning is unfortunate. The books out of copyright aren’t
Google’s books — they aren’t the OCA’s books — they aren’t the library’s
books. They’re OUR books. Get it together, everyone, and sort something out.
Plus, I’d still like to see Google stop scanning books that are in copyright
without express permission to help ease the concerns publishers are having. More
on that in my past post, Search
Engines, Permissions & Moving Forward In Copyright Battles.