Google now able to search scanned text in PDFs

Posted on October 31st, 2008 in SEO by admin

Companies publishing PDFs containing scanned documents online may see their content better indexed by Google from now on.

According to the search engine, it has developed optical character recognition (OCR) technology that can turn pictures of words in PDF documents into text that can be crawled for search ranking purposes.

Previously, Google was only able to effectively crawl a few of these documents, according to a post by product manager Evin Levey on the Official Google Blog.

“This OCR technology lets us convert a picture (of a thousand words) into a thousand words - words that can be searched and indexed, so that these valuable documents are more easily found,” he commented.

Google is currently working with Adobe to improve the visibility of Flash rich media web content to search engines, the technology for which was recently detailed by Adobe at the AJAX World RIA Conference & Expo, according to InfoWorld.
