Important: The name of your image and ground truth file must match without the extension while preparing the dataset. Otherwise the trainer will throw an error.
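In case it helps, here is a rough Python sketch for checking the naming rule above before you start training. It assumes the ground truth files end in `.gt.txt` and the images are PNG or TIFF; the function name `find_unmatched` and the folder path are made up for illustration, so adjust them to your setup.

```python
from pathlib import Path

# Assumptions: ground truth files end in ".gt.txt", images are PNG/TIFF.
IMAGE_EXTS = {".png", ".tif", ".tiff"}
GT_SUFFIX = ".gt.txt"


def find_unmatched(gt_dir):
    """Return (images without ground truth, ground truth without image)."""
    files = [p for p in Path(gt_dir).iterdir() if p.is_file()]
    # Base names of the images, e.g. "image_001" for "image_001.png".
    images = {p.stem for p in files if p.suffix.lower() in IMAGE_EXTS}
    # Base names of the ground truth files with ".gt.txt" removed.
    truths = {p.name[: -len(GT_SUFFIX)] for p in files if p.name.endswith(GT_SUFFIX)}
    return sorted(images - truths), sorted(truths - images)


if __name__ == "__main__":
    folder = Path("data/kernsys-ground-truth")  # hypothetical path
    if folder.is_dir():
        no_truth, no_image = find_unmatched(folder)
        print("Images without ground truth:", no_truth)
        print("Ground truth without image:", no_image)
```

If both lists come back empty, every image has a ground truth file with the same base name.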
excellent video, thank you
By far the best explanation of Tesseract training. 👌🏼
Thanks a lot bro. You are literally my savior for today. Thanks a bunch.
Most of my data has two lines. What should I do in that case?
Can I use PNG and .box files in the data, bro?
If I need to train on Arabic numbers, can I do it the same way? There is no Arabic number dataset to download!
@appsscope2487 You can create the dataset yourself, and yes, follow this procedure for fine-tuning. Remember to pass the language type as RTL.
combine_tessdata failed for me at the step shown at 12:39. Please help.
@inkmaze Can you share the log?
@SL7Tech Sure:
You are using make version: 4.4.1
combine_tessdata -u ../tessdata//deu_latf.traineddata data/deu_latf/engplus
process_begin: CreateProcess(NULL, combine_tessdata -u ../tessdata//deu_latf.traineddata data/deu_latf/engplus, ...) failed.
make (e=2): The system cannot find the file specified.
make: *** [Makefile:207: data/deu_latf/engplus.lstm-unicharset] Error 2
@SL7Tech Oh, I forgot to add Tesseract to PATH, LOL.
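For anyone hitting the same "cannot find the file specified" message, a quick way to confirm whether the tools are reachable is something like the sketch below. The helper name `missing_tools` is made up; the two tool names are the ones that appear in the logs in this thread.

```python
import shutil


def missing_tools(tools=("tesseract", "combine_tessdata")):
    """Return the names of the given executables that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All tools found on PATH.")
```

If anything is reported as missing, add the Tesseract install folder to PATH and open a new terminal before running make again.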
Since pytesseract is terrible with alphanumeric words, can we train it with that kind of dataset?
True, I've been trying for a long time to train it for the Consolas alphanumeric font, but Tesseract is very inaccurate. HELP
I ran into this error:
$ make training MODEL_NAME=kernsys START_MODEL=eng TESSDATA=../tessdata/ MAX_ITERATIONS=2000 LEARNING_RATE=0.001
You are using make version: 4.4.1
tesseract "data/kernsys-ground-truth/image_001.png" data/kernsys-ground-truth/image_001 --psm 13 lstm.train
No box data found in 'data/kernsys-ground-truth/image_001.box'.
Failed to read boxes from data/kernsys-ground-truth/image_001.png
Error during processing.
make: *** [Makefile:248: data/kernsys-ground-truth/image_001.lstmf] Error 1
Make sure that the ground truth file is not empty.
@SL7Tech It is not empty.
Ran into the same error. In my case it was an empty (zero-byte) file with a .box extension, apparently created during a previous failed attempt to run the command. After deleting the file, it worked.
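If you have many images, hunting for leftover empty .box files by hand gets tedious. Here is a small Python sketch that lists and deletes them. The function names and the folder path are made up for illustration, and it assumes the .box files sit directly in the ground truth folder.

```python
from pathlib import Path


def find_empty_box_files(gt_dir):
    """Return the zero-byte .box files directly inside gt_dir, sorted by path."""
    return sorted(
        p for p in Path(gt_dir).glob("*.box")
        if p.is_file() and p.stat().st_size == 0
    )


def delete_empty_box_files(gt_dir):
    """Delete the zero-byte .box files in gt_dir and return the removed paths."""
    removed = find_empty_box_files(gt_dir)
    for path in removed:
        path.unlink()
    return removed


if __name__ == "__main__":
    folder = Path("data/kernsys-ground-truth")  # hypothetical path
    if folder.is_dir():
        for path in delete_empty_box_files(folder):
            print("Deleted empty box file:", path)
```

Non-empty .box files are left alone, so it should be safe to run this before retrying the training command.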