OCR Settings Page


This page is one of the pages in the Settings dialog.

This page contains settings for the Scanned Document Mode view.

The first part of the page contains color settings:

Validated words

The color used to mark validated words, i.e., words that the user has validated.

Current Validated word Background

The background color of the word currently being validated. Used only in Scanned Document Validation mode.

The second part contains the OCR process settings:

Dictionary Language

Allows the user to select the language of the dictionary used in the OCR process. Words that do not appear in the dictionary will have a higher chance of being marked as suspect.
If 'all' is selected (which is the default), the OCR process will guess the language of the word.

Perform OCR on the whole document

When this option is set, PDF2XL will automatically run the OCR process on the whole document in the background.
This makes operations such as page navigation and text search faster, but it uses extra CPU time, which can be noticeable for large documents. If PDF2XL is mainly used to convert just one or two pages per document and text search is not needed, it is recommended to clear this box.

The third part contains the settings for the Scanned Document Validation Area:

Font

Allows the user to select the font used by the validation edit box.
Note that the text size is matched to the size of the image in the Word Preview Area.

The fourth part displays the pre-OCR test options:

Automatically use OCR mode on Scanned PDF

Check this option to automatically enable the Scanned Document Mode whenever the user selects an area that contains only an image.
If this option is not checked, users will be asked if they wish to enable the Scanned Document Mode.
This option is off by default.

Allow scrolling between pages using the mouse

When this option is set, scrolling down with the mouse wheel while the bottom of a page is displayed moves PDF2XL to the next page, and scrolling up while the top of a page is displayed moves it to the previous page.
Clearing the box will prevent this from happening.
This option is on by default.
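
The rule described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not PDF2XL's actual code: the function name, its parameters, and the sign convention for the wheel movement are all assumptions made for the example.

```python
def page_after_wheel(page, page_count, wheel_delta, at_top, at_bottom,
                     allow_page_scroll=True):
    """Return the page number to display after a mouse-wheel event.

    Hypothetical helper. Assumed convention: wheel_delta < 0 means the
    wheel was rolled down, wheel_delta > 0 means it was rolled up.
    Pages are numbered from 1 to page_count.
    """
    if not allow_page_scroll:
        # Option cleared: the wheel never changes the page.
        return page
    if wheel_delta < 0 and at_bottom and page < page_count:
        return page + 1  # rolled down at the bottom of the page
    if wheel_delta > 0 and at_top and page > 1:
        return page - 1  # rolled up at the top of the page
    return page
```

With the option cleared (allow_page_scroll=False), the function always returns the current page, which corresponds to clearing the box.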

Validate only suspect words

Check this option to go over suspect words only in the Scanned Document Validation Mode; uncheck to go over all the words.
This option is on by default.

Ignore validated words

Check this option to ignore validated words while in Scanned Document Validation Mode; uncheck to go over all the words, even if validated before.
This option is on by default.
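
The way the two validation options combine can be illustrated with a small Python sketch. It is a model of the behavior described above under assumed names (Word, words_to_review, and the two flags are hypothetical), not the PDF2XL implementation.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical model of one OCR-recognized word."""
    text: str
    suspect: bool = False    # marked as suspect by the OCR process
    validated: bool = False  # already validated by the user


def words_to_review(words, only_suspect=True, ignore_validated=True):
    """Return the words that validation would go over.

    The defaults mirror the documented defaults (both options on).
    """
    selected = []
    for word in words:
        if only_suspect and not word.suspect:
            continue  # "Validate only suspect words" is checked
        if ignore_validated and word.validated:
            continue  # "Ignore validated words" is checked
        selected.append(word)
    return selected
```

With both options checked, only suspect words that have not yet been validated are visited. With both unchecked, every word is visited.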

Enable Learning

Setting this option enables the OCR Feedback process.
This option is off by default.

Learn While Converting

This option enables the OCR Feedback process during conversion. It has an effect only if the Enable Learning option is on.
This option is off by default.

Keep Learned Knowledge

If this option is set, the OCR Feedback results will be kept and used when performing an OCR operation again.
If it is off, the results will only be used for the OCR process of the document they were learned from.
This option is on by default.
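
The relationship between the three learning options can be summarized in a short Python sketch. The class and function names are hypothetical and serve only to restate the documented defaults and dependencies; they do not reflect how PDF2XL stores or evaluates these settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LearningOptions:
    """Hypothetical container; defaults follow the documented defaults."""
    enable_learning: bool = False
    learn_while_converting: bool = False
    keep_learned_knowledge: bool = True


def learns_during_conversion(opts):
    # "Learn While Converting" only matters when "Enable Learning" is on.
    return opts.enable_learning and opts.learn_while_converting


def feedback_scope(opts):
    # Kept results are reused by later OCR operations; otherwise they
    # apply only to the document they were learned from.
    if opts.keep_learned_knowledge:
        return "later OCR operations"
    return "source document only"
```

For example, LearningOptions(learn_while_converting=True) still does not learn during conversion, because Enable Learning is off by default.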

Reset

This command resets (empties) the learned knowledge file. The user must confirm this action; once confirmed, all the OCR Feedback results will be deleted.

Advanced OCR Settings

Displays the Advanced OCR Settings dialog, where the advanced OCR options can be set.