Custom OCR Language Packs
How to create custom language packs for use in IronOCR?
Creating a custom language pack requires training a new Tesseract 4 LSTM language file / dictionary from a font.
There are many tutorials available online explaining the steps required to do this. The process is not simple, but it is thankfully quite well-documented.
As a good place to start, we suggest this YouTube tutorial from Gabriel Garcia (no affiliation) and their linked GitHub repository:
Once complete, the output will be a .traineddata file.
The .traineddata file can then be referenced in IronOCR as follows:
Doc: https://ironsoftware.com/csharp/ocr/languages/
using System;
using IronOcr;

var Ocr = new IronTesseract();

// Load the trained data for your new font.
// Multiple fonts can be used.
Ocr.UseCustomTesseractLanguageFile("mydir/custom.traineddata");

// Run OCR on an image and print the recognized text.
using (var Input = new OcrInput(@"images\image.png"))
{
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}
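A relative path such as mydir/custom.traineddata may not point where you expect once the application is deployed, so it can help to confirm the file exists before handing it to IronOCR. The following is only a minimal sketch of that check: the class name and the error message are illustrative, the paths are the same placeholder paths used above, and File.Exists resolves relative paths against the current working directory, which may or may not match how your application is launched.

```csharp
using System;
using System.IO;
using IronOcr;

class CustomLanguageCheck // hypothetical class name, for illustration only
{
    static void Main()
    {
        // Placeholder path; point this at your own training output.
        string languageFile = "mydir/custom.traineddata";

        // Stop early with a clear message if the trained data is missing.
        if (!File.Exists(languageFile))
        {
            Console.Error.WriteLine(
                $"Language file not found: {Path.GetFullPath(languageFile)}");
            return;
        }

        var ocr = new IronTesseract();
        ocr.UseCustomTesseractLanguageFile(languageFile);

        // Same read-and-print step as in the example above.
        using (var input = new OcrInput(@"images\image.png"))
        {
            Console.WriteLine(ocr.Read(input).Text);
        }
    }
}
```

Printing the full path in the error message makes it easier to see which directory the relative path was resolved against.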