More About Google Print

Prediction: we’ll talk about Google Print until they debut the beta, then we’ll talk about it more.

Copyfight posted some follow-up on Google’s announcement earlier this week. Of note was a quote from Michael Madison:

A first thought: It’s one more example, and a pretty important one, of the fading of the lines separating copyright law from communications law. Is Google Print an information conduit? A massive, rogue P2P technology? Is it a contributory infringer? A publisher? From whom, if anyone, does it need licenses, and who, if anyone, should regulate it, and how, if at all?

TeleRead started talking about how Google Print will be presented:

My understanding, which may be wrong, is that Google will OCR the page scans, but do only cursory machine cleanup of the raw unstructured text that results (which I call “raw digital text” or RDT), and use the still-error-laden RDT in their search system to pull up the page scans.

You can see this approach now in the way Amazon presents results of its “search inside this book” feature. The text is indexed for searching, but clicking on the results brings up the scanned, bitmapped pages. When available, the feature is incredibly useful, but I feel cheated when I try to copy and paste the text.
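To make the idea concrete, here is a rough sketch of that arrangement: uncorrected OCR text is indexed, and each hit points back to the page scan rather than to the text itself. Everything in it is my own assumption for illustration (the file names, the sample text, and the function names are hypothetical), and it is not a description of how Amazon or Google actually build their systems.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each token to the set of page-scan identifiers whose OCR text contains it."""
    index = defaultdict(set)
    for scan_id, ocr_text in pages.items():
        for token in tokenize(ocr_text):
            index[token].add(scan_id)
    return index

def search(index, query):
    """Return the sorted scan identifiers whose OCR text contains every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(hits)

# Hypothetical sample data: scan file name -> raw, uncorrected OCR text.
pages = {
    "page-001.png": "Tbe quick brown fox",  # OCR misread "The" as "Tbe"
    "page-002.png": "The lazy dog sleeps",
}
index = build_index(pages)

# The reader is handed the image, never the error-laden text behind it.
print(search(index, "quick fox"))
print(search(index, "the"))
```

Because the index is built from raw OCR output, the misread “Tbe” on the first page means a search for “the” finds only the second page. That is the trade-off of searching RDT: errors stay hidden behind the scans, but they quietly cost some matches.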

TeleRead points out that this is also how the University of Michigan’s Making of America collection works.

MoA scanned the books, placed the scanned page images online, and built a search engine to search the resulting RDT from OCR. Then, one by one they have been converting the RDT from selected books to highly-proofed SDT (structured digital text) using human proofers and TEI (I think) for structuring. So, the scans came first, and then the cleanup was (and is being) done at a later time.

The excitement here, for TeleRead, is that Google might end up contributing to efforts like Project Gutenberg and could benefit greatly from the Distributed Proofreaders volunteers.
