PDF2XL OCR Online Help

Advanced OCR Settings Dialog

This dialog allows you to set the advanced OCR options for PDF2XL OCR. We recommend that you change these settings only if you get poor results with the default settings.

The following options are available:

Accuracy: Speed vs. Recognition

You can choose how much time PDF2XL OCR will spend on trying to convert the scanned image into text. Moving the slider towards the 'speed' (left) side will result in faster conversion; moving it towards the 'recognition' (right) side will result in better text recognition.
If the results of the OCR process are satisfactory but the process is slower than you'd like, we recommend moving the slider to the left; if the OCR results are poor, try moving the slider to the right to give priority to recognition.

Accuracy: Perform whole-page OCR when converting

When this option is checked, PDF2XL OCR performs OCR on the whole page when converting. When it is cleared, OCR is performed only on the parts of the page that appear in the layout.
Checking this option makes the conversion slower, but in some cases it can make the results much more accurate.

Advanced image cleaning

Setting this option allows you to change PDF2XL OCR's advanced image cleaning options. Each scanned page's image will also take a little longer to process, so the OCR process itself will be slower.

Threshold

You can either allow PDF2XL OCR to select an automatic monochrome threshold, or set it manually. Change this setting if the scanned page is either very light or very dark.
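
To illustrate what a monochrome threshold does, here is a minimal Python sketch. It is not PDF2XL OCR's implementation, which this page does not describe. The function names are hypothetical, grayscale values are assumed to run from 0 (black) to 255 (white), and Otsu's method is used only as a stand-in for whatever automatic selection the product performs.

```python
def binarize(pixels, threshold):
    """Turn a grayscale image (rows of 0-255 values) into ink (1) / paper (0)."""
    return [[1 if value <= threshold else 0 for value in row] for row in pixels]


def otsu_threshold(pixels):
    """Pick the threshold that best separates dark and light pixels.

    Stand-in for an 'automatic' threshold; assumed, not taken from the product.
    """
    histogram = [0] * 256
    for row in pixels:
        for value in row:
            histogram[value] += 1

    total = sum(histogram)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    dark_count, dark_sum = 0, 0
    for level in range(256):
        dark_count += histogram[level]
        if dark_count == 0:
            continue
        light_count = total - dark_count
        if light_count == 0:
            break
        dark_sum += level * histogram[level]
        dark_mean = dark_sum / dark_count
        light_mean = (sum_all - dark_sum) / light_count
        # Between-class variance (up to a constant factor).
        variance = dark_count * light_count * (dark_mean - light_mean) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


# A very light scan: the "ink" is only slightly darker than the paper.
faint_scan = [
    [240, 180, 240],
    [240, 180, 240],
]

fixed = binarize(faint_scan, 128)      # mid-gray threshold loses the ink
manual = binarize(faint_scan, 200)     # a higher manual threshold keeps it
automatic = binarize(faint_scan, otsu_threshold(faint_scan))
```

In this toy example the mid-gray threshold of 128 turns the whole page white, while both the manually raised threshold and the automatically chosen one recover the faint column of ink. This is the kind of very light page for which changing the threshold can help.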

Despeckle

By setting this option you can make the OCR process ignore small dots and imperfections in the scanned image. If the scanned document has a lot of 'noise', this option can help enormously.
To use it, check the Despeckle box and select the maximum size of dot to remove. Moving the slider to the right makes PDF2XL OCR remove larger and larger 'dots', up to quite sizable chunks.
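
The effect of the maximum dot size can be sketched in Python as follows. This is an illustration only, not PDF2XL OCR's actual algorithm: the function name `despeckle`, the pixel-count measure of dot size, and the use of 8-connected pixel groups are all assumptions.

```python
from collections import deque


def despeckle(bitmap, max_speck_size):
    """Erase every connected group of ink pixels (1s) that has at most
    max_speck_size pixels. Pixels touching diagonally count as connected."""
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    result = [row[:] for row in bitmap]
    seen = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if bitmap[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search collects one connected group of ink.
            group = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                group.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and bitmap[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(group) <= max_speck_size:
                for cy, cx in group:
                    result[cy][cx] = 0
    return result


noisy = [
    [1, 0, 0, 0, 0],   # single-pixel speck
    [0, 0, 1, 1, 0],   # 2x2 block standing in for part of a character
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],   # two-pixel speck
]

light_cleaning = despeckle(noisy, 1)   # removes only the single pixel
medium_cleaning = despeckle(noisy, 2)  # also removes the two-pixel speck
heavy_cleaning = despeckle(noisy, 4)   # removes the 2x2 block as well
```

The last call shows why a large setting should be used with care: once the limit reaches the size of real character fragments, those are erased along with the noise.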

Remove Lines

If this option is set, PDF2XL OCR will try to remove vertical and horizontal lines before processing the image. This is mostly useful when processing an image scanned from old computer printout paper with pre-printed lines.
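
One simple way to picture line removal is to erase any unbroken horizontal or vertical run of ink that is too long to belong to a character. The Python sketch below takes that approach. It is an assumption made for illustration, not a description of how PDF2XL OCR works, and the names `long_run_indexes`, `remove_lines` and `min_length` are hypothetical.

```python
def long_run_indexes(line, min_length):
    """Return the positions covered by runs of 1s at least min_length long."""
    indexes = []
    start = None
    # The trailing 0 closes a run that reaches the end of the line.
    for i, value in enumerate(list(line) + [0]):
        if value == 1 and start is None:
            start = i
        elif value != 1 and start is not None:
            if i - start >= min_length:
                indexes.extend(range(start, i))
            start = None
    return indexes


def remove_lines(bitmap, min_length):
    """Erase horizontal and vertical ink runs of min_length pixels or more."""
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    to_clear = set()

    for y in range(height):                      # horizontal lines
        for x in long_run_indexes(bitmap[y], min_length):
            to_clear.add((y, x))
    for x in range(width):                       # vertical lines
        column = [bitmap[y][x] for y in range(height)]
        for y in long_run_indexes(column, min_length):
            to_clear.add((y, x))

    return [
        [0 if (y, x) in to_clear else bitmap[y][x] for x in range(width)]
        for y in range(height)
    ]


ruled_page = [
    [1, 1, 1, 1, 1, 1],   # pre-printed ruled line
    [0, 1, 0, 0, 1, 0],   # short strokes standing in for text
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

cleaned = remove_lines(ruled_page, min_length=5)
```

Here the six-pixel ruled line is erased while the shorter strokes survive. The sketch also shows a limitation of this simple approach: text pixels that lie on the line itself are removed with it.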

Defaults

Clicking this button resets the advanced OCR settings to PDF2XL OCR's factory defaults. You will be asked to confirm the change. You can also discard the changes before they are applied by clicking the Cancel button.


2009 Cogniview Ltd. All rights reserved