Any newer ('08, I think) Ricoh can do it, provided you install the Java VM card, which is around $170, plus the software, which could be NSI AutoStore Express for around $1,300 or X-Solution's Digidocflow for around $1,000.

Funny you asked about Sharp. I've never sold it before, so I don't know if SharpDesk is an add-on, if it's standard, or if you can buy higher-end versions of it, but they've been doing it for a while.
SharpDesk is a value-added technology that comes with the equipment. It is a very powerful application, and one of the reasons why Sharp has been BLI's product line of the year 2 out of the last 3 years. It would be a rare event that someone would want a searchable PDF right on the unit; that is much more easily managed at the workstation.

Although Xerox and Konica Minolta offer searchable PDFs created within the copier, does anybody have any real-world experience of how good these onboard features are at processing larger, more complex documents?  I suspect they are probably acceptable for smaller 1-2 page documents that consist mostly of text, but what happens to their accuracy when processing more complex 10+ page documents?


I do not see how a built-in Xerox/Konica Minolta OCR engine could compete in both speed and accuracy with the industry-leading OmniPage or ABBYY OCR engines, which run on PCs with more processing power than a copier could have.

That's the rhetoric I've always heard from Ricoh when bringing up deals in which I had to add middleware to get this functionality and lost because of the price gap those additions create across a fleet of 5+ machines.  In reading the literature on the new Ricoh MP C3003/C3503/C4503/C5503/C6003, it looks like Scan to Searchable PDF will be standard on those models.  I look forward to seeing how well and how quickly it works compared to NSI AutoStore, et al.

Well at least my sales spin is not so different from Ricoh's.  I would be interested to hear about your test results when you have a chance.


Ricoh must have been taking some hits and losing some sales on this knockout feature.


IT departments like the idea of not having to manage third-party software to create searchable PDFs.


The question remains how happy users will be with the quality and speed that an MFP's onboard processor can deliver when dealing with longer, more complex documents.

I've spoken to Monte (fellow P4P'er) on this subject, and his take is that most of these units will choke when trying to process large files (many documents to be scanned).  The other day I had an instance with an HP plotter where the customer had placed the documents on a USB drive and was then trying to print from it; after 2 hours they gave up.


I will also test when it comes out and I'm hoping I'm done before December!!!


Canon: Finding this topic of interest, I decided to run a test on one of our demo room devices, a Canon ImageRunner Advance C2230. While a newer model, this family of IRAdvance does not offer all of the horsepower and features of the larger and more costly models. It does feature PDF (searchable) as a scanned file format. So, I placed a memory stick in the USB port and chose B&W, 300 dpi, searchable PDF. I placed a "typical" office document (actually, my Treeno EDM Administrator's Guide) in the ADF and scanned 50 pages. Result: file scanned, written to USB, and completed in 3 minutes 15 seconds. File size: 3.1 MB.

I then opened the file on my Mac using Adobe Reader and it is indeed searchable.
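
For anyone comparing numbers, here is the arithmetic on that test as a small Python sketch. The inputs are the figures reported above; the 100-page projection assumes scan time scales linearly with page count, which is only an assumption and not something this test showed.

```python
# Figures reported in the Canon C2230 test above.
PAGES = 50
ELAPSED_SECONDS = 3 * 60 + 15   # 3 minutes 15 seconds
FILE_SIZE_MB = 3.1


def pages_per_minute(pages: int, seconds: float) -> float:
    """Effective scan-plus-OCR throughput."""
    return pages / (seconds / 60)


def kb_per_page(size_mb: float, pages: int) -> float:
    """Average file size per page, taking 1 MB = 1024 KB."""
    return size_mb * 1024 / pages


def projected_seconds(target_pages: int, pages: int, seconds: float) -> float:
    """Linear projection (assumption: time per page stays constant)."""
    return target_pages * (seconds / pages)


print(f"{pages_per_minute(PAGES, ELAPSED_SECONDS):.1f} pages/minute")    # 15.4
print(f"{kb_per_page(FILE_SIZE_MB, PAGES):.1f} KB/page")                 # 63.5
print(f"{projected_seconds(100, PAGES, ELAPSED_SECONDS):.0f} s for 100 pages")  # 390
```

So the unit ran at roughly 15 pages per minute end to end, and a 100-page job would take about six and a half minutes if nothing degrades with document length.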


Yes, I was impressed by this simple test. The Treeno EDM manual is as you described: easy-to-read font, and a layout that includes tables and simple graphics. I will repeat the test with a 100-page document as soon as I can find something that's more similar to a customer's document. We need a standard document to test; anyone have a 100-page Slerexe letter?!!

A lot of our solutions have text-searchable PDFs as a requirement, and I spend a lot of time in the legal vertical, where it's crucial. All the newer HP MFPs have "flow" versions, which create text-searchable PDFs, do background cleanup, etc. So, from my experience, if it's low-volume convenience scanning, then sure, do it using the machine's embedded OCR capabilities. But if it's volume scanning, this won't cut it; you need to use DSS/AutoStore etc., and even then you can get bottlenecks depending on the number of OCR engines available.
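
To put the engine bottleneck in rough numbers, here is a back-of-the-envelope Python sketch. The function names and every figure in the example are hypothetical placeholders, not measurements from DSS, AutoStore, or any real OCR engine; plug in your own rates.

```python
import math


def engines_needed(pages_in_per_hour: int, pages_per_engine_per_hour: int) -> int:
    """Smallest number of OCR engines that keeps up with incoming pages."""
    return math.ceil(pages_in_per_hour / pages_per_engine_per_hour)


def backlog_after(hours: int, pages_in_per_hour: int,
                  engines: int, pages_per_engine_per_hour: int) -> int:
    """Pages still waiting for OCR after `hours` of steady scanning."""
    capacity = engines * pages_per_engine_per_hour
    return max(0, (pages_in_per_hour - capacity) * hours)


# Hypothetical example: a fleet scanning 3,000 pages/hour,
# each OCR engine handling 900 pages/hour.
print(engines_needed(3000, 900))        # 4
print(backlog_after(8, 3000, 2, 900))   # 9600 pages queued after an 8-hour day
```

The point is only that once scanning outpaces the combined engine capacity, the queue grows all day, which is where users start noticing the delay.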


But here is a thought:


Recently we had a customer with high-volume OCR requirements. What actually worked out to be a better solution was to drop in non-text-searchable PDFs, so they are available immediately, and use content crawler software to process and convert all the documents in their ECMS to searchable.
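
The crawler idea can be sketched in a few lines of Python. This is only an illustration of the workflow, not any vendor's product: `has_text_layer` and `make_searchable` are hypothetical hooks you would wire to whatever OCR tool or ECMS API you actually use, and a plain folder stands in for the ECMS repository.

```python
import os
from typing import Callable, Iterator, List


def find_pdfs(root: str) -> Iterator[str]:
    """Walk the repository folder and yield every PDF path."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.lower().endswith(".pdf"):
                yield os.path.join(dirpath, name)


def crawl_and_convert(root: str,
                      has_text_layer: Callable[[str], bool],
                      make_searchable: Callable[[str], None]) -> List[str]:
    """OCR only the PDFs that are not searchable yet; return what was converted.

    Both callables are placeholders (assumptions) for a real OCR back end.
    """
    converted = []
    for path in find_pdfs(root):
        if has_text_layer(path):
            continue                 # already searchable, leave it alone
        make_searchable(path)        # slow step, runs after the scan is filed
        converted.append(path)
    return converted
```

Because the scan is filed as an image-only PDF first, the user never waits on OCR; the crawler can run overnight or on a schedule.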

Triple thanks for your helpful reply!


I am currently trying to scan a large pile of client data documents, approx. 20 pages each, into searchable PDFs.  The copier is scanning directly to an OmniPage Pro 16 OCR engine. There are a lot of different content pages, including application forms with lots of lines, boxes, changing fonts, logos and handwriting.  OmniPage is having a hard time with accuracy and quality, so it seems fair to presume that trying to complete this task via a copier's onboard processing capabilities would likely result in unhappy customers.


The content crawler is a great idea that I will have to ponder.  Can you recommend a specific crawler for me to look at?

One thing that I am doing that the customer likes is the insertion of a bar code separator page.  I have stored a predefined bar code in the eFiling box of the copier. Its content does not change. The operator can print off as many bar codes as they need.  The operator inserts these paper bar codes where appropriate into a stack of documents waiting to be scanned.  Every time the OCR engine sees this bar code in a stack of documents, it knows to create a new file.  We are currently just scanning as PDFs, not searchable, for the highest accuracy.
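
For anyone curious, the separator logic boils down to something like this Python sketch. It is not how OmniPage implements it; `is_separator` is a hypothetical stand-in for the bar code recognition, and I am assuming the separator sheets themselves are dropped from the output files.

```python
from typing import Callable, List, Sequence, TypeVar

Page = TypeVar("Page")


def split_on_separator(pages: Sequence[Page],
                       is_separator: Callable[[Page], bool]) -> List[List[Page]]:
    """Split one scanned stack into documents at each separator sheet.

    Separator pages are discarded, and back-to-back separators do not
    produce empty documents.
    """
    documents: List[List[Page]] = []
    current: List[Page] = []
    for page in pages:
        if is_separator(page):
            if current:
                documents.append(current)
            current = []
        else:
            current.append(page)
    if current:                      # last document has no trailing separator
        documents.append(current)
    return documents


stack = ["a1", "a2", "SEP", "b1", "SEP", "SEP", "c1"]
print(split_on_separator(stack, lambda p: p == "SEP"))
# [['a1', 'a2'], ['b1'], ['c1']]
```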

