Confidence threshold

A scanned image, or especially a photograph, may contain areas of varying quality. Some regions may be dim or blurry, some may contain dirt, spots, scratches, glare, unwanted gradients, and other defects. These factors can interfere with recognition and lead to less accurate OCR results or the wrong order of the recognized text.

The complex structure analysis algorithm allows for very flexible filtering of areas with low confidence. You can choose whether to skip such areas and achieve highly reliable OCR results at the cost of losing some of the content, or to extract the maximum amount of text from an image with slightly lower accuracy. This filtering threshold is controlled through dsrConfidence recognition setting:

Alias	Behavior
Default	Synonym for `Mid`.
Low	Extract most of the text except for very low quality areas.
LowMid	Extract most of the text except for low quality areas.
Mid	Skip some blurry or dull areas and image defects. Default behavior that balances recognition accuracy and coverage.
MidHigh	Prefer more accurate recognition at the cost of losing some content from areas of low confidence.
High	Skip most of the blurry areas and image defects.
Ultra	Process only the highest quality areas, skipping all others.
All	Try to extract all the text from the image, regardless of its quality. May lead to less accurate results and artifacts, but can extract text even from illustrations or logos.

If this parameter is omitted, medium (Mid) filtering is used, providing a good balance between recognition accuracy and volume of extracted text. As a general rule, it is a good idea to tweak dsrConfidence only if you are unhappy with the result:

If you get unwanted artifacts or prefer maximum recognition accuracy, try setting more aggressive filtering (Ultra, High or MidHigh).
Keep in mind, that some content blocks will not be recognized.
If some blocks of text do not appear in the recognition results, set the value to Low or LowMid.
To get all the text from the image, regardless of the accuracy, set the value of the parameter to All.

dsrConfidence parameter only applies to complex structure analysis and combined structure analysis modes.

Curved text detection