Description
Tesseract supports Unicode (UTF-8) and can recognize more than 100 languages "out of the box". It can be trained to recognize other languages. Tesseract supports various output formats: plain text, hOCR (HTML), and PDF.
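As a rough illustration of the output formats mentioned above, the sketch below asks for plain text, hOCR, and PDF in one run. The file names `scan.png` and `scan` are placeholders (assumptions, not part of the snap), and `-l eng` assumes the English language data is available.

```shell
# Hypothetical input image and output base name; replace with your own.
input="scan.png"
outbase="scan"

if command -v tesseract >/dev/null 2>&1 && [ -f "$input" ]; then
    # The trailing words select output formats:
    # txt -> scan.txt, hocr -> scan.hocr, pdf -> scan.pdf
    tesseract "$input" "$outbase" -l eng txt hocr pdf
    status="ran"
else
    echo "tesseract or $input not found; skipping" >&2
    status="skipped"
fi
```

Running `tesseract --list-langs` should show which languages are actually installed before you pick a value for `-l`.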
If you want to access files under /media/* or /run/media/*, you'll have to connect the snap to the core snap's removable-media interface:
$ sudo snap connect tesseract:removable-media
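Once the interface is connected, a quick check might look like the sketch below. The mount point and file name are hypothetical, so adjust them to wherever your removable drive is mounted; the readability test runs in your normal shell and does not by itself prove the snap has access.

```shell
# Hypothetical path on a removable drive; adjust to your own device.
image="/media/${USER:-user}/usbstick/scan.png"

if command -v snap >/dev/null 2>&1; then
    # Show the snap's interfaces and whether removable-media is connected.
    snap connections tesseract || true
fi

if command -v tesseract >/dev/null 2>&1 && [ -r "$image" ]; then
    # Writes the recognized text next to the image as scan.txt.
    tesseract "$image" "${image%.png}" -l eng
    result="ran"
else
    echo "tesseract missing or cannot read $image; skipping" >&2
    result="skipped"
fi
```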