OCR | ARTECHNE database

Optical Character Recognition (OCR) is a technology that extracts text from an image or scanned document so that it can be searched and indexed. We use ABBYY FineReader, OCR software that has a special module for historical texts. Although this works very well for works printed after roughly 1800, earlier works are often hard to read using OCR because printing was far less standardized before the nineteenth century. As a result, the text produced by running OCR on early modern works needs manual correction before it can be made searchable in the database, so it takes some time before we can add more early modern sources. Manuscripts and early modern works printed in Gothic font are generally not OCR-readable and thus require manual transcription. You will soon be able to help us transcribe such sources.