OCR: CV guide

Preprocess for Tesseract

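import cv2

# "page.png" is a placeholder path; any BGR image loaded by OpenCV works here.
img_bgr = cv2.imread("page.png")

# Grayscale, light blur to suppress noise, then let Otsu pick the threshold.
gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (3, 3), 0)
_, th = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Optional: morphology to close gaps in strokes. With dark text on a light page,
# strokes are black (0) after THRESH_BINARY, so opening the white regions
# is what joins nearby black stroke fragments.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
th = cv2.morphologyEx(th, cv2.MORPH_OPEN, kernel)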

Deskewing, denoising, and contrast normalization often improve line-based OCR considerably compared with feeding raw color photos to the engine.

pytesseract

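# pip install pytesseract; the Tesseract OCR binary must also be installed and on PATH
import pytesseract
from PIL import Image

# th is the binarized NumPy array from the preprocessing step
pil_img = Image.fromarray(th)
text = pytesseract.image_to_string(pil_img, lang="eng")
# Per-word boxes and confidences as a dict of parallel lists
data = pytesseract.image_to_data(pil_img, lang="eng", output_type=pytesseract.Output.DICT)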

image_to_data returns per-word boxes (left, top, width, height) and confidences (conf, with -1 marking rows that are not words), which is useful for debugging.
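One way to use that output when debugging is to keep only the words above a confidence cutoff. The sketch below assumes the dict layout produced by Output.DICT. The helper name confident_words and the cutoff of 60 are illustrative choices, not pytesseract defaults.

```python
def confident_words(data, min_conf=60.0):
    """Return (text, conf, (left, top, width, height)) for words at or above min_conf."""
    words = []
    for i, text in enumerate(data["text"]):
        # conf may arrive as str, int, or float depending on the pytesseract version.
        conf = float(data["conf"][i])
        if conf < 0 or not text.strip():
            continue  # structural rows (page, block, line) carry no word
        if conf >= min_conf:
            box = (data["left"][i], data["top"][i],
                   data["width"][i], data["height"][i])
            words.append((text, conf, box))
    return words
```

Words that fall below the cutoff are often the ones worth inspecting visually, for example by drawing their boxes on the preprocessed image.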

Scene text (idea)

OpenCV’s DNN module can run a frozen EAST text detector or ONNX recognition models. A typical pipeline produces quadrilaterals or axis-aligned boxes, warps each crop to a fixed height, then runs a recognition network on the crop. Training on custom data usually yields better domain accuracy than generic English-only models.

Takeaways

Match the engine to the layout: printed forms vs. street signs vs. handwriting.
Evaluate with character/word accuracy on a held-out set.
Privacy: redact sensitive text, or avoid storing it, unless a policy covers it.
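For the evaluation point, a minimal sketch of character and word error rates based on Levenshtein distance is shown below; accuracy is then 1 minus the error rate. The function names and the (reference, prediction) pair layout are assumptions for illustration. Real evaluations often normalize whitespace and case first.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rates(pairs):
    """Return (CER, WER) over an iterable of (reference, prediction) strings."""
    char_err = char_total = word_err = word_total = 0
    for ref, hyp in pairs:
        char_err += edit_distance(ref, hyp)
        char_total += len(ref)
        ref_words, hyp_words = ref.split(), hyp.split()
        word_err += edit_distance(ref_words, hyp_words)
        word_total += len(ref_words)
    return char_err / max(char_total, 1), word_err / max(word_total, 1)
```

Summing errors over the whole held-out set before dividing, as done here, stops short lines from dominating the average.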

                

Quick FAQ

Wrong language?

Install the matching tessdata pack and pass the language codes to pytesseract, for example lang="hin+eng".

Handwriting?

Tesseract handles handwriting poorly; use specialized HTR models or line-level sequence models trained on IAM-style corpora.
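For the language question, it can help to check which tessdata packs are installed before building the lang argument. The sketch below assumes a recent pytesseract release that provides get_languages. The helper names build_lang_arg and installed_languages are illustrative.

```python
def build_lang_arg(requested, installed):
    """Join language codes with '+' after checking each one is installed."""
    missing = [code for code in requested if code not in installed]
    if missing:
        raise ValueError(f"Missing tessdata for: {', '.join(missing)}")
    return "+".join(requested)


def installed_languages():
    """Ask the local Tesseract install which language packs it has."""
    # Imported lazily: this call needs the Tesseract binary on PATH.
    import pytesseract
    return pytesseract.get_languages(config="")
```

A call such as build_lang_arg(["hin", "eng"], installed_languages()) then either returns a string suitable for the lang parameter or fails early with the names of the missing packs.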

