PaddleOCR Python Demo
- Published on 5 Sep 2024
- In this video I demonstrate, using a Google Colab notebook, how Optical Character Recognition (OCR) can be done on images using PaddleOCR. PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and apply them in practice.
If you like such content, please subscribe to the channel here:
www.youtube.co...
If you would like to support me financially (this is totally optional and voluntary), buy me a coffee here: www.buymeacoff...
Relevant Links:
github.com/Pad...
colab.research...
Thanks for sharing!
At present, we use PaddleOCR in actual business. We found that PaddleOCR performs well on Chinese and English recognition, but poorly on handwritten character recognition.
Yes, handwritten text is challenging.
@RitheshSreenivasan Yes, especially for Chinese handwriting recognition.
Thanks for the videos. Can you discuss Hugging Face NER / Top2Vec and search engine possibilities using NER tags? Useful links would be helpful.
Ok
Nice work! PaddleOCR is a very convenient tool for multiple OCR tasks.
Thank You
Nice video. But there is a small error that becomes a major one when you try to debug. In the #draw ocr code, the result value should point to result[0] before the values for boxes, txts and scores are stored.
Thanks!
Need to know more about table structure recognition.
Great work! Helps a lot!👍
Thank You
Sir, what is craftocr and how do I use it?
I found PaddleOCR works great for document images, but less mature in street scenarios.
Ok. Good to know
Whenever I use the draw_ocr method, I get this " TypeError: '
I am not sure why you are getting this error. It looks like an issue with how parameters are passed to this method. Maybe you can debug line by line.
@RitheshSreenivasan You need to use result[0]; there might be one more level of nesting in the results.
I am getting the same error too.
Can you please share how to use PP-Structure from PaddlePaddle to detect tables and recognize the layout? Thank you.
I have not worked on Paddle for more than a year now. Refer to their GitHub
Please explain the following paper: Defect Prediction With Semantics and Context Features of Codes Based on Graph Representation Learning
Let me see if I can understand the paper
@RitheshSreenivasan If you are interested, we can work on this together.
Does this work only on invoices, or will it work on other images such as ID cards as well? Do I need to train it specifically for that?
You need to test and see.
Error: Can not import paddle core while this file exists: /usr/local/lib/python3.10/dist-packages/paddle/fluid/libpaddle.so Why is this error occurring?
Raise an issue on their GitHub. It has something to do with how you have installed paddle.
I am also getting the same error. I used your Colab notebook exactly as it is and still get the same error.
@Abhishekkumar-wn9do This code was written a year ago. There may have been library changes in the meantime.
So raise an issue on their GitHub. It has something to do with how you have installed paddle.
OK @RitheshSreenivasan
I have the same error too. Let's update it here if we find a solution.
Hello,
Thank you for your work and explanations.
I would like to transform students' handwritten paper copies into text files that could be processed with MS Word, for example. What do you suggest to get the best recognition, please?
GPT-4V has good OCR. You can try it with the OpenAI API. On the open source side, PaddleOCR seems to be good.
Thanks
Thank You!
Can you make a step-by-step guide? I really want to learn but do not know where you are installing it. Is it in Command Prompt?
It is in a Colab notebook.
Thank you for sharing with us... :)
I have data in a borderless table format (some columns may have data, some not) in PDF files. Can we convert the extracted data into a table format (rows and columns) using Paddle?
I have not worked with Paddle in a long time. Do check their GitHub or raise an issue there.
@RitheshSreenivasan Thank you.
Hi Rithesh! Nice work, I have tried replicating the same steps on my end, but I get the following error when I run this line of code:
from PIL import Image, ImageDraw, ImageFont
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1][0] for line in result]
font = ImageFont.load_default()
im_show = draw_ocr(image, boxes, txts, scores,font_path='/usr/share/fonts/truetype/humor-sans/Humor-Sans.ttf' )
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
Here goes the error:
385 box_num = len(boxes)
386 for i in range(box_num):
--> 387 if scores is not None and (scores[i] < drop_score or
388 math.isnan(scores[i])):
389 continue
TypeError: '
Looks like a bug in their code. You can open this file in your local installation and convert str to int.
@RitheshSreenivasan Thanks for the response. Figured out the issue: there is a small error that becomes a major one when you try to debug. In the #draw ocr code, the result value should point to result[0] before the values for boxes, txts and scores are stored.
for line in result[0]:
    print(line)
boxes = [line[0] for line in result[0]]
txts = [line[1][0] for line in result[0]]
scores = [line[1][1] for line in result[0]]
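To illustrate the fix above without installing anything, here is a minimal sketch of the unpacking step run on a hand-made result. The sample data and the helper name unpack_first_image are made up for illustration and are not part of PaddleOCR. The nesting (one list per image, each entry of the form [box, (text, score)]) is assumed from the snippets in this thread and may differ between PaddleOCR versions; treating a None entry as "no detections" is also an assumption.

```python
def unpack_first_image(result):
    """Split the detections of the first image into boxes, texts and scores.

    Assumes result[0] is a list of [box, (text, score)] entries, as in the
    comment above. The helper name is hypothetical, not a PaddleOCR API.
    """
    lines = result[0] or []  # assumption: None or empty means no detections
    boxes = [line[0] for line in lines]
    txts = [line[1][0] for line in lines]
    scores = [line[1][1] for line in lines]
    return boxes, txts, scores


# Hand-made sample mimicking the assumed structure (not real OCR output).
sample_result = [[
    [[[10, 10], [90, 10], [90, 30], [10, 30]], ("Invoice", 0.98)],
    [[[10, 40], [120, 40], [120, 60], [10, 60]], ("No. 12345", 0.91)],
]]

boxes, txts, scores = unpack_first_image(sample_result)
print(txts)
print(scores)
```

The point of the sketch is that each score ends up as a plain float, which is what the comparison against drop_score in the traceback above operates on.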
Where do we get the font path? Where should we download that font from?
Look at the Linux system paths for fonts.
Good work! I am after an OCR solution for extracting specific data from receipts. Have you developed a script for that, or can PaddleOCR do it?
Check and see if you can use PaddleOCR.
How could I now train a model to recognise one type of document more accurately?
It has been a long time since I made this video. Your best bet is the PaddleOCR GitHub. Check there or raise a query there.
Hello, last week the notebook was running properly. When I ran the same notebook now, it gave me the error 'Can not import paddle core while this file exists: /usr/local/lib/python3.10/dist-packages/paddle/fluid/libpaddle.so'. Please help.
Looks like an installation or library change issue. Try a fresh install of paddle. If it does not work, raise an issue on the Paddle GitHub. The authors of the library are the best people to provide a solution.
@RitheshSreenivasan Even in Colab it gives the same error.
@raghavsharma4398 Contact the authors on GitHub.
How to select the font path?
I looked up the font paths in Linux and provided one of them. Google for the same.
Getting this error "AttributeError: module 'paddle.fluid.core_avx' has no attribute 'is_compiled_with_rocm'" on this line "ocr = PaddleOCR (use_angle_cls=True, lang='en')"
Put this error in their GitHub issues list. Maybe you will get a resolution.
Why does the paper say PP-OCR is "Ultra Lightweight"? What does that mean? Does it mean much faster than other existing OCR methods? If so, how much faster?
Please refer to the paper and you will get your answers.
@RitheshSreenivasan Just read the paper. I think ultra lightweight means it has a small footprint.
Can you do a video on Paddle Lite for mobile?
I will check it out.
Can PaddleOCR handle PDFs too? It's been impossible for me to make it work with them.
Please check their GitHub page.
Can you make a video on post-processing of OCR output, such as extracting information like the invoice number from the text?
OK, let me try.
@RitheshSreenivasan Hey bro, it really helped me, thanks for the video. If possible, can you suggest a way of doing post-processing?
It depends on what information you want to extract. You can use heuristic rules or search terms, and then the location of the bounding box, to extract information. This will vary from use case to use case.
@RitheshSreenivasan Yes, but as you said, if we use condition-wise comparison then we need to write logic for all the different PDFs. So I want to build a model that recognizes, for example, the date and invoice number irrespective of the layout.
It is difficult in practice to write such a model. You can make use of some lookups.
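The heuristic approach described in this thread (a search term to find a label, then bounding box location to find its value) could be sketched roughly as follows. Everything here is illustrative: the OCR lines are hand-made, the helper names center and value_right_of are hypothetical, and the "nearest box to the right on roughly the same row" rule is just one possible heuristic that will not suit every layout.

```python
import re

# Hand-made OCR lines: (box, text), where box is four [x, y] corner points.
ocr_lines = [
    ([[10, 10], [120, 10], [120, 30], [10, 30]], "Invoice No:"),
    ([[130, 10], [220, 10], [220, 30], [130, 30]], "INV-00123"),
    ([[10, 50], [60, 50], [60, 70], [10, 70]], "Date"),
    ([[130, 52], [230, 52], [230, 72], [130, 72]], "05/09/2024"),
]


def center(box):
    """Return the (x, y) center of a four-point box."""
    xs = [point[0] for point in box]
    ys = [point[1] for point in box]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def value_right_of(label_pattern, lines, max_dy=15):
    """Find the first line matching label_pattern and return the text of the
    nearest box to its right whose vertical center is within max_dy pixels."""
    for box, text in lines:
        if not re.search(label_pattern, text, re.IGNORECASE):
            continue
        label_x, label_y = center(box)
        candidates = []
        for other_box, other_text in lines:
            other_x, other_y = center(other_box)
            if other_x > label_x and abs(other_y - label_y) <= max_dy:
                candidates.append((other_x - label_x, other_text))
        if candidates:
            return min(candidates)[1]  # smallest horizontal distance wins
    return None


invoice_number = value_right_of(r"invoice\s*no", ocr_lines)
invoice_date = value_right_of(r"\bdate\b", ocr_lines)
print(invoice_number, invoice_date)
```

As noted in the replies, rules like this have to be tuned per document type; the max_dy tolerance in particular depends on image resolution.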
Hello. Thank you for the nice video. Unfortunately, I cannot install google.colab. Is there an alternative for that? I have a Mac M1.
You do not need to install Google Colab. It is a web application from Google.
@RitheshSreenivasan Thank you. I have another question. In my photos, I may have angled text, vertical text, and so on. How does this model detect those cases?
You have to check for yourself.
Hi Rithesh, do you know how to set the font_path in the im_show = draw_ocr line?
I have explained about the fonts in the video. I only know of that method
On Windows: C:/Windows/Fonts/Arial.ttf
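Since the right font path depends on the machine, one option is to try a few candidate paths and use the first one that exists. This is only a sketch: the helper name first_existing_font is made up, and the two candidates are simply the paths mentioned in this thread, which may not exist on your system.

```python
import os


def first_existing_font(candidates):
    """Return the first path in candidates that exists as a file, else None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


# Candidate paths taken from this thread; adjust for your own system.
candidate_fonts = [
    "/usr/share/fonts/truetype/humor-sans/Humor-Sans.ttf",
    "C:/Windows/Fonts/Arial.ttf",
]

font_path = first_existing_font(candidate_fonts)
print(font_path)
```

If font_path comes back as None, none of the candidates exist and a font file has to be located or uploaded first; otherwise the value can be passed as the font_path argument shown in the draw_ocr calls above.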
Will PaddleOCR do well on text that has an angled orientation?
You have to check for yourself. I have not tried it. Any OCR should have a skew correction package.
Thanks a lot, but when I try to run this code on Google Colab it gives me an error. Can I ask you to help me?
---> 10 im_show = draw_ocr(image, boxes, txts, scores, font_path='BNazanin.ttf')
11 im_show = Image.fromarray(im_show)
12 im_show.save('result.jpg')
/usr/local/lib/python3.10/dist-packages/paddleocr/tools/infer/utility.py in draw_ocr(image, boxes, txts, scores, drop_score, font_path)
380 box_num = len(boxes)
381 for i in range(box_num):
--> 382 if scores is not None and (scores[i] < drop_score or
383 math.isnan(scores[i])):
384 continue
TypeError: '
Did you get any solution for this error?
Which OCR engine is the best for handwriting recognition?
I find PaddleOCR to be good.
@RitheshSreenivasan Can PaddleOCR recognize handwriting on a whiteboard?
We are building an OCR to convert the handwriting on the whiteboard into machine text. TrOCR is good, but it can't take large images, so what would you recommend?
Try and see.
My Google Colab session is crashing. Can I know the reason for this? Do I need to upgrade to Colab Pro to use this?
No need. There could be some other issue. Is it a GPU instance?
@RitheshSreenivasan Yes.
Just check for other issues.
TypeError Traceback (most recent call last)
Cell In[43], line 15
13 image = Image.open(img_path).convert('RGB')
14 boxes = [line[0] for line in result]
---> 15 txts = [line[1][0] for line in result]
16 scores = [line[1][1] for line in result]
17 im_show = draw_ocr(image, boxes, txts, scores)
Cell In[43], line 15, in (.0)
13 image = Image.open(img_path).convert('RGB')
14 boxes = [line[0] for line in result]
---> 15 txts = [line[1][0] for line in result]
16 scores = [line[1][1] for line in result]
17 im_show = draw_ocr(image, boxes, txts, scores)
TypeError: 'float' object is not subscriptable
Hello, I am not able to resolve this error and have no clue why this keeps on happening. I am working on Jupyter Notebook. Can you please help?
Book a session on my Topmate link or refer to the PaddleOCR documentation