itext/itext-pdfocr-dotnet
pdfOCR is an iText add-on that recognizes and extracts text from scanned documents and images. It can also convert them into fully ISO-compliant PDF or PDF/A-3u files that are accessible, searchable, and suitable for archiving.
GitHub repository with 49 stars and 17 forks.
Language: C#
Topics: archival, character, data, diacritic, extractable, glyphs, hindi, image, iso-compliant, ligatures