Perform OCR on an image of typed text

You can use tesseract, an OCR command-line program, to convert an image of typed text, like this one, into plain text (a .txt file):

Here is the terminal command to perform OCR on the example image:

tesseract image.png stdout --psm 12 --dpi 70 > output.txt 

which will write the following text to output.txt:

You can also run the same tesseract command on a scanned document like this one, this time leaving off the redirection so that the output goes to the terminal:

which would output the following text:

For more information about the various command-line options, use tesseract --help or man tesseract.
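As a rough sketch, the two invocations described in this post (saving to a file, and printing to the terminal) could be wrapped in small shell functions. The function names ocr_to_file and ocr_to_terminal are hypothetical, chosen only for illustration; the flags are the ones used in the command above. Here --psm 12 selects the "sparse text with OSD" page segmentation mode and --dpi 70 tells tesseract the resolution of the input image; both values suited the example image and will likely need adjusting for other inputs.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, flags are copied from the post.

# OCR an image and save the recognized text to a file.
# $1 = input image, $2 = output text file
ocr_to_file() {
    tesseract "$1" stdout --psm 12 --dpi 70 > "$2"
}

# OCR an image and print the recognized text to the terminal.
# $1 = input image
ocr_to_terminal() {
    tesseract "$1" stdout --psm 12 --dpi 70
}

# Run the example only if tesseract is installed and image.png exists.
if command -v tesseract >/dev/null 2>&1 && [ -f image.png ]; then
    ocr_to_file image.png output.txt
    ocr_to_terminal image.png
fi
```

Running tesseract --help-psm lists the available page segmentation modes. Note that tesseract can also write the file itself: tesseract image.png output creates output.txt without any redirection.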



Image sources
  • Wikipedia contributors. "Optical character recognition." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 27 May 2021. Web. 31 May 2021.
  • Andrews' Book & Job Printing Office. "The Macon directory for 1860, containing the names of the inhabitants, a business directory and an appendix of much useful information." Washington Memorial Library (Macon, Ga.). 1860.

