More About Google Print

Prediction: we’ll talk about Google Print until they debut the beta, then we’ll talk about it more.

Copyfight posted a follow-up to Google’s announcement earlier this week. Of note was a quote from Michael Madison:

A first thought: It’s one more example, and a pretty important one, of the fading of the lines separating copyright law from communications law. Is Google Print an information conduit? A massive, rogue P2P technology? Is it a contributory infringer? A publisher? From whom, if anyone, does it need licenses, and who, if anyone, should regulate it, and how, if at all?

TeleRead started talking about how Google Print will be presented:

My understanding, which may be wrong, is that Google will OCR the page scans, but do only cursory machine cleanup of the raw unstructured text that results (which I call “raw digital text” or RDT), and use the still-error-laden RDT in their search system to pull up the page scans.

You can see this approach now in the way Amazon presents results of its “search inside this book” feature. The text is indexed for searching, but clicking on the results brings up the scanned, bitmapped pages. When available, the feature is incredibly useful, but I feel cheated when I try to copy and paste the text.

TeleRead points out that this is also how the University of Michigan’s Making of America collection works.

MoA scanned the books, placed the scanned page images online, and built a search engine to search the resulting RDT from OCR. Then, one by one they have been converting the RDT from selected books to highly-proofed SDT (structured digital text) using human proofers and TEI (I think) for structuring. So, the scans came first, and then the cleanup was (and is being) done at a later time.
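To make the scan-first approach concrete, here is a minimal sketch of the idea: index the raw OCR text, but return the page images rather than the text. It is not how Google, Amazon, or MoA actually implement search; the file names, sample text, and function names (`tokenize`, `build_index`, `search`) are all made up for illustration.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word in the raw OCR text to the page scans it appears on.

    pages: dict of scan image filename -> uncorrected OCR text (the RDT).
    """
    index = defaultdict(set)
    for image, raw_text in pages.items():
        for word in tokenize(raw_text):
            index[word].add(image)
    return index

def search(index, query):
    """Return the page scans whose OCR text contains every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)

# Hypothetical sample data: two page scans and their raw OCR output.
pages = {
    "page_0001.png": "The Making of Arnerica collection",  # OCR misread "America"
    "page_0002.png": "Copyright law and communications law",
}
index = build_index(pages)

print(search(index, "copyright law"))  # the reader gets the scan, not the text
print(search(index, "america"))        # misses page 1 because of the OCR error
```

The second query shows why the later proofreading pass matters: any word the OCR got wrong is invisible to search until someone corrects it.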

The excitement here, for TeleRead, is that Google might end up contributing to efforts like Project Gutenberg and could benefit greatly from the Distributed Proofreaders volunteers.
