Google Starts Indexing Scanned Documents

Wednesday, November 5, 2008

Google has begun indexing scanned documents using optical character recognition (OCR) technology, which lets the company convert ‘a picture (of a thousand words) into a thousand words’. According to Google, “This is a small but important step forward in our mission of making all the world’s information accessible and useful.”

“In the past, scanned documents were rarely included in search results as we couldn't be sure of their content. We had occasional clues from references to the document -- so you might get a search result with a title but no snippet highlighting your query. Today, that changes. We are now able to perform OCR on any scanned documents that we find stored in Adobe’s PDF format,” Google said.

Omer Syed - All Things Search
