OCR stands for Optical Character Recognition - the process by which PDF2XL OCR converts images (from scanned documents or image files) into text that can then be converted into Excel.
PDF2XL OCR mostly runs the OCR process page by page: OCR is performed once for each page when you visit it for the first time, and once again while the document is converted.
Automatically recognizing text in images is difficult, especially in a low-quality scan, so the OCR engine will occasionally make mistakes. That's why PDF2XL OCR includes a process called OCR Validation, in which you are asked to validate words that the OCR engine is uncertain about. The images of the suspect words are displayed one after the other, each with a text box beneath it containing what the OCR engine thinks is written there. You can either accept that result or, if it's wrong, correct it.
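To make the idea of "words the engine is uncertain about" more concrete, here is a minimal Python sketch of how such a review queue could work. It is not PDF2XL OCR's actual implementation: the `RecognizedWord` class, the `confidence` score, and the 0.8 threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RecognizedWord:
    text: str          # what the engine thinks is written
    confidence: float  # hypothetical certainty score between 0.0 and 1.0


def words_to_validate(words, threshold=0.8):
    """Return the words whose confidence falls below the (assumed) threshold."""
    return [w for w in words if w.confidence < threshold]


def apply_corrections(words, corrections):
    """Build the final text list.

    `corrections` maps a word's position to the text the user typed.
    Words without an entry keep the engine's result (the user accepted it
    or was never asked about it).
    """
    return [corrections.get(i, w.text) for i, w in enumerate(words)]


words = [
    RecognizedWord("Invoice", 0.99),
    RecognizedWord("1O0", 0.42),   # letter O read in place of a zero
    RecognizedWord("Total", 0.95),
]
suspects = words_to_validate(words)          # only the "1O0" word is queued
final = apply_corrections(words, {1: "100"}) # the user fixes that one word
```

Only the low-confidence word is shown to the user, which is why validation usually takes far less time than proofreading the whole document.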
The OCR Validation process is not mandatory, but it's highly recommended to do it before converting the document, in order to avoid mistakes.
It's important to note that sometimes the OCR process itself will have to run again - for example, if you change something in the layout - and this will cancel any validation you might have done before. Therefore it's recommended to fully prepare your layout, including all the details such as column formats, and only then use the OCR Validation.
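The reason for this ordering can be pictured with a small Python sketch. It is only a model of the behavior described above, not the product's code: the `PageOcrState` class and its method names are hypothetical, and the assumption that any real change to a column format triggers a new OCR run is a simplification.

```python
class PageOcrState:
    """Toy model of one page: its layout settings and the user's validations."""

    def __init__(self):
        self.column_formats = {}  # column name -> format, e.g. "numeric"
        self.validated = {}       # word position -> text confirmed by the user

    def validate(self, position, text):
        """Record that the user confirmed or corrected a word."""
        self.validated[position] = text

    def set_column_format(self, column, fmt):
        """Change the layout; a real change makes the earlier validation stale."""
        if self.column_formats.get(column) != fmt:
            self.column_formats[column] = fmt
            # OCR has to run again, so earlier answers no longer apply.
            self.validated.clear()


page = PageOcrState()
page.validate(3, "100")
page.set_column_format("A", "numeric")  # layout changed after validating
lost = dict(page.validated)             # the earlier validation is gone
```

Finishing the layout first means the validation step is done once and its results survive until conversion.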
Tip: Changing a column's format from Automatic to a specific option may help the OCR engine read the text correctly. For example, it can be hard to tell the digit 0 from the letter O, but in a numeric column the ambiguity goes away, because only the digit makes sense there.
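The effect of a specific column format can be illustrated with a short Python sketch. This is an assumption about how such a hint might be used, not a description of the PDF2XL OCR engine: the `read_cell` function, the format names, and the look-alike table are invented for the example.

```python
# Characters that are easily confused with digits in a scan (illustrative list).
LOOKALIKES = str.maketrans({
    "O": "0", "o": "0",
    "l": "1", "I": "1",
    "S": "5",
    "B": "8",
})


def read_cell(raw_text, column_format="automatic"):
    """Interpret raw OCR output using the column format as a hint.

    In a numeric column, letters that resemble digits are read as digits.
    In any other format the text is returned unchanged, because a letter O
    may be perfectly valid there.
    """
    if column_format == "numeric":
        return raw_text.translate(LOOKALIKES)
    return raw_text


as_number = read_cell("l2,5OO.00", "numeric")  # look-alikes resolved to digits
as_text = read_cell("l2,5OO.00")               # left as the engine read it
```

With the Automatic format the engine has no such hint, which is why the same cell is more likely to end up in the validation queue.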