PaddleOCR Python Demo

Rithesh Sreenivasan

26,489 views

Published on 5 Sep 2024
In this video I demonstrate, using a Google Colab notebook, how Optical Character Recognition (OCR) can be done on images using PaddleOCR. PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and apply them in practice.
If you like such content please subscribe to the channel here:
www.youtube.co...
If you would like to support me financially (this is totally optional and voluntary), buy me a coffee here: www.buymeacoff...
Relevant Links:
github.com/Pad...
colab.research...

Comments • 97

@user-lz7ym5qs6w 2 years ago ⁺³
Thanks for sharing!
We currently use PaddleOCR in production. We found that it performs well on Chinese and English recognition but poorly on handwritten characters.
@RitheshSreenivasan 2 years ago
Yes, handwritten text is challenging.
@kaitaojiang5839 2 years ago
@@RitheshSreenivasan Yes, especially for Chinese handwriting recognition.
@venkatesanr9455 2 years ago ⁺³
Thanks for the videos. Can you discuss Hugging Face NER / Top2Vec and search engine possibilities using NER tags? Useful links would be helpful.
@RitheshSreenivasan 2 years ago
Ok
@kaitaojiang5839 2 years ago
Nice work! PaddleOCR is a very convenient tool for multiple OCR tasks.
@RitheshSreenivasan 2 years ago
Thank You
@bharathbarakam5702 4 months ago ⁺²
Nice video. But there is a small error that turns into a major one when you try to debug. In the #draw ocr code, use result[0] instead of result when extracting the values for boxes, txts and scores!
@elunicoodiseo 4 months ago
Thanks!
@vimalaug15 2 months ago
Need to know more about table structure recognition.
@littletomatomonkeysmeeeeel8324 2 years ago ⁺²
Great work! Helps a lot!👍
@RitheshSreenivasan 2 years ago
Thank You
@YTHITTLEROFFICIAL 25 days ago
Sir, what is craftocr and how do I use it?
@littletomatomonkeysmeeeeel8324 2 years ago
I found that PaddleOCR works great on document images but is less mature on street scenes.
@RitheshSreenivasan 2 years ago
Ok. Good to know
@Jim-hn8hd 1 year ago ⁺¹
Whenever I use the draw_ocr method, I get this " TypeError: '
@RitheshSreenivasan 1 year ago ⁺¹
I am not sure why you are getting this error. It looks like an issue with how parameters are passed to this method. Maybe you can debug line by line.
@mayurdangar3804 1 year ago
@@RitheshSreenivasan You need to use result[0]; there might be one more level of nesting in the results.
@vidhyashree5359 11 months ago
I am getting the same error too.
@nomuchohan 10 months ago
Can you please share how to use PPStructure from PaddlePaddle to detect tables and recognize the layout and everything? Thank you.
@RitheshSreenivasan 9 months ago
I have not worked on Paddle for more than a year now. Refer to their GitHub.
@souvickdas5564 2 years ago
Please explain the following paper: Defect Prediction With Semantics and Context Features of Codes Based on Graph Representation Learning
@RitheshSreenivasan 2 years ago
Let me see if I can understand the paper
@souvickdas5564 2 years ago
@@RitheshSreenivasan If you are interested, we can work on this together.
@rashmikasaha2874 5 months ago
Does this work only on invoices, or will it work on other images like ID cards as well? Do I need to train it specifically for that?
@RitheshSreenivasan 5 months ago
You need to test and see.
@abhishekg4147 1 year ago ⁺¹
Error: Can not import paddle core while this file exists: /usr/local/lib/python3.10/dist-packages/paddle/fluid/libpaddle.so
Why is this error occurring?
@RitheshSreenivasan 1 year ago
Raise an issue on their GitHub. It has something to do with how you have installed paddle.
@Abhishekkumar-wn9do 1 year ago
I am also getting the same error. I used your Colab notebook exactly as is and still get the same error.
@RitheshSreenivasan 1 year ago ⁺¹
@@Abhishekkumar-wn9do This code was written a year ago. There may have been library changes in the meantime.
So raise an issue on their GitHub. It has something to do with how you have installed paddle.
@Abhishekkumar-wn9do 1 year ago
@@RitheshSreenivasan ok
@mohammedmuzammilkhan3043 1 year ago
I have the same error too. Let's post an update here if we find a solution.
@urbandancesquad2 9 months ago
Hello,
Thank you for your work and explanations.
I would like to convert students' handwritten papers into text files that can be processed with MS Word, for example. What do you suggest for the best recognition?
@RitheshSreenivasan 9 months ago
GPT-4V has good OCR. You can try it with the OpenAI API. On the open source side, PaddleOCR seems to be good.
@masoudparpanchi505 1 year ago ⁺¹
Thanks
@RitheshSreenivasan 1 year ago
Thank You!
@LezZeppelinFanPage-nm1ly 4 months ago
Can you make a step-by-step guide? I really want to learn but do not know where you are installing. Is it in Command Prompt?
@RitheshSreenivasan 4 months ago
It is in a Colab notebook.
@ram_rahim_creations_officials 10 months ago
Thank you for sharing with us... :)
I have borderless tables (some columns may have data, some may not) in PDF files. Can we convert the extracted data into table format (rows and columns) using Paddle?
@RitheshSreenivasan 10 months ago
I have not worked with Paddle in a long time. Do check their GitHub or raise an issue there.
@ram_rahim_creations_officials 10 months ago ⁺¹
@@RitheshSreenivasan Thank you.
@bharathbarakam5702 4 months ago
Hi Rithesh! Nice work, I have tried replicating the same steps on my end, but I get the following error when I run this line of code:
from PIL import Image, ImageDraw, ImageFont
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1][0] for line in result]
font = ImageFont.load_default()
im_show = draw_ocr(image, boxes, txts, scores,font_path='/usr/share/fonts/truetype/humor-sans/Humor-Sans.ttf' )
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
Here goes the error:
385 box_num = len(boxes)
386 for i in range(box_num):
--> 387 if scores is not None and (scores[i] < drop_score or
388 math.isnan(scores[i])):
389 continue
TypeError: '
@RitheshSreenivasan 4 months ago
Looks like a bug in their code. You can open this file in your local installation and convert str to int
@bharathbarakam5702 4 months ago ⁺¹
@@RitheshSreenivasan Thanks for the response. Figured out the issue: there is a small error that turns into a major one when you try to debug. In the #draw ocr code, use result[0] instead of result when extracting the values for boxes, txts and scores.
@Que_me_miras6 7 days ago
for line in result[0]:
    print(line)
@Que_me_miras6 7 days ago
boxes = [line[0] for line in result[0]]
txts = [line[1][0] for line in result[0]]
scores = [line[1][1] for line in result[0]]
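The result[0] fix reported in this thread can be wrapped in a small helper. This is only a sketch: the function name `unpack_ocr_result` is hypothetical, and it assumes each detected line has the shape `[box, (text, score)]`, with newer releases wrapping the lines of each image in one extra list (which is what the commenters describe). It uses mock data rather than a real OCR call.

```python
def unpack_ocr_result(result):
    """Split OCR output into parallel lists of boxes, texts and scores.

    Assumption: each line is [box, (text, score)], where box is a list of
    [x, y] points. Handles both the flat layout (a list of lines) and the
    nested layout (a list holding one list of lines per image).
    """
    if not result or not result[0]:
        return [], [], []
    # Nested layout: result[0][0] is a line, so result[0][0][0][0] is an [x, y] point.
    # Flat layout: result[0] is a line, so result[0][0][0][0] is a plain number.
    nested = isinstance(result[0][0][0][0], (list, tuple))
    lines = result[0] if nested else result
    boxes = [line[0] for line in lines]
    txts = [line[1][0] for line in lines]
    scores = [line[1][1] for line in lines]  # a float per line, not line[1][1][0]
    return boxes, txts, scores


# Mock data standing in for real OCR output (hypothetical values).
box = [[0.0, 0.0], [10.0, 0.0], [10.0, 5.0], [0.0, 5.0]]
nested_result = [[[box, ("hello", 0.98)]]]
boxes, txts, scores = unpack_ocr_result(nested_result)
```

The three lists can then be passed to draw_ocr in place of the ones built directly from result.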
@shreyasmagajikondi7838 5 months ago
Where do we get the font path? Where should we download that font from?
@RitheshSreenivasan 5 months ago
Look at the Linux system paths for fonts.
@ramiyengar1 1 year ago
Good work! I am after an OCR solution for extracting specific data from receipts. Have you developed a script for that, or can PaddleOCR do it?
@RitheshSreenivasan 1 year ago
Check and see if you can use PaddleOCR.
@rishisharath6668 1 year ago
How could I now train a model to recognise one type of document more accurately?
@RitheshSreenivasan 1 year ago
It has been a long time since I made this video. Your best bet is the PaddleOCR GitHub. Check there or raise a query there.
@user-mm7gj4gs5g 1 year ago
Hello, last week the notebook was running properly. The same notebook, run now, gives me the error 'Can not import paddle core while this file exists: /usr/local/lib/python3.10/dist-packages/paddle/fluid/libpaddle.so'. Please help.
@RitheshSreenivasan 1 year ago
Looks like an installation or library change issue. Try a fresh install of paddle. If that does not work, raise an issue on the Paddle GitHub. The authors of the library are the best people to provide a solution.
@raghavsharma4398 11 months ago
@@RitheshSreenivasan Even in Colab it gives the same error.
@RitheshSreenivasan 11 months ago
@@raghavsharma4398 Contact the authors on GitHub.
@kartikkatoch2097 5 months ago
How to select the font path?
@RitheshSreenivasan 5 months ago
I looked up the font paths in Linux and provided that font path. Google for the same.
@DivyaSharma-oq7up 1 year ago
Getting this error "AttributeError: module 'paddle.fluid.core_avx' has no attribute 'is_compiled_with_rocm'" on this line "ocr = PaddleOCR(use_angle_cls=True, lang='en')"
@RitheshSreenivasan 1 year ago
Put this error in their GitHub issues list. Maybe you will get a resolution.
@incameet 1 year ago
Why does the paper say PP-OCR is "Ultra Lightweight"? What does that mean? Does it mean much faster than other existing OCR methods? If so, how much faster?
@RitheshSreenivasan 1 year ago
Please refer to the paper and you will get your answers.
@incameet 1 year ago ⁺¹
@@RitheshSreenivasan Just read the paper. I think ultra lightweight means it has a small footprint.
@senthilkumarnadarajan2247 1 year ago ⁺¹
Can you do a video on Paddle Lite for mobile?
@RitheshSreenivasan 1 year ago ⁺¹
I will check it out.
@carolinardc96 1 year ago
Can PaddleOCR handle PDFs too? It's been impossible for me to make it work with them.
@RitheshSreenivasan 1 year ago
Please check their GitHub page.
@harshavardhanachyuta2055 1 year ago
Can you make a video on post-processing of OCR output, such as extracting information like the invoice number from the text?
@RitheshSreenivasan 1 year ago ⁺¹
Ok let me try
@harshavardhanachyuta2055 1 year ago
@@RitheshSreenivasan Hey bro, it really helped me, thanks for the video. If possible, can you suggest a way of doing post-processing?
@RitheshSreenivasan 1 year ago ⁺¹
It depends on what information you want to extract. You can use heuristic rules or search terms, and then the location of the bounding box, to extract information. This will vary from use case to use case.
@harshavardhanachyuta2055 1 year ago
@@RitheshSreenivasan Yes, as you said, if we use condition-wise comparison then we need to write logic for all the different PDFs. So I want to build a model that recognizes, for example, the date and invoice number irrespective of the layout.
@RitheshSreenivasan 1 year ago
It is difficult in practice to build such a model. You can make use of some lookups.
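The heuristic mentioned in this thread (a search term plus the location of bounding boxes) might look roughly like the sketch below. The helper names `box_bounds` and `value_right_of` and the sample lines are hypothetical; the only assumption about the OCR output is that each line has the shape `[box, (text, score)]`. It picks the nearest box to the right of the keyword on the same row, which will not suit every layout.

```python
def box_bounds(box):
    """Return (left, top, right, bottom) for a list of [x, y] points."""
    xs = [p[0] for p in box]
    ys = [p[1] for p in box]
    return min(xs), min(ys), max(xs), max(ys)


def value_right_of(lines, keyword):
    """Return the text of the nearest box to the right of the keyword box.

    A box counts as being on the same row if the vertical midpoint of the
    keyword box falls between its top and bottom edges.
    """
    for box, (text, _score) in lines:
        if keyword.lower() in text.lower():
            _left, top, right, bottom = box_bounds(box)
            break
    else:
        return None  # keyword not found anywhere
    mid_y = (top + bottom) / 2
    candidates = []
    for box, (text, _score) in lines:
        l, t, _r, b = box_bounds(box)
        if l >= right and t <= mid_y <= b:
            candidates.append((l - right, text))
    return min(candidates)[1] if candidates else None


# Mock OCR lines (hypothetical values, not output from a real invoice).
sample_lines = [
    [[[10, 10], [110, 10], [110, 30], [10, 30]], ("Invoice No:", 0.99)],
    [[[130, 11], [220, 11], [220, 29], [130, 29]], ("INV-0042", 0.97)],
    [[[10, 50], [60, 50], [60, 70], [10, 70]], ("Date:", 0.99)],
    [[[130, 50], [230, 50], [230, 70], [130, 70]], ("05/09/2024", 0.95)],
]
invoice_number = value_right_of(sample_lines, "invoice no")
```

Each new layout still needs its own keywords and direction rules (right of, below), which is the per-document effort the thread refers to.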
@narijami 1 year ago
Hello. Thank you for the nice video. Unfortunately, I cannot install google.colab. Is there an alternative? I have a Mac M1.
@RitheshSreenivasan 1 year ago
You do not need to install Google Colab. It is a web application from Google.
@narijami 1 year ago
@@RitheshSreenivasan Thank you. I have another question. In my photos, I may have angled text, vertical text, and so on. How does this model detect those cases?
@RitheshSreenivasan 1 year ago
You have to check for yourself.
@shadabsheikh3859 1 year ago
Hi Rithesh, do you know how to set the font_path in the im_show = draw_ocr line?
@RitheshSreenivasan 1 year ago
I have explained the fonts in the video. I only know of that method.
@renantrevisan2406 4 months ago
On Windows: C:/Windows/Fonts/Arial.ttf
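One way to avoid hard-coding a single font_path is to try a few candidates and use the first one that exists on the machine. A minimal sketch, assuming the two paths quoted in these comments as candidates; the helper name `first_existing_font` is hypothetical, and neither path is guaranteed to exist on any given system.

```python
import os

# Candidate paths taken from this comment thread; adjust for your own system.
CANDIDATE_FONTS = [
    "/usr/share/fonts/truetype/humor-sans/Humor-Sans.ttf",
    "C:/Windows/Fonts/Arial.ttf",
]


def first_existing_font(candidates):
    """Return the first path in candidates that is an existing file, else None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


font_path = first_existing_font(CANDIDATE_FONTS)
if font_path is None:
    print("No candidate font found; add a .ttf path that exists on this machine.")
```

The returned path can then be passed as the font_path argument of draw_ocr.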
@harshalpal8564 1 year ago
Will PaddleOCR do well on text that has an angled orientation?
@RitheshSreenivasan 1 year ago
You have to check for yourself. I have not tried it. Any OCR should have a skew correction package.
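To check how skewed the detected text is, the angle of each bounding box can be estimated from its first two points. This is only a sketch of a measurement, not a skew correction: the helper name `box_angle_degrees` is hypothetical, and it assumes the box points are ordered top-left, top-right, bottom-right, bottom-left in image coordinates (y increasing downward).

```python
import math


def box_angle_degrees(box):
    """Angle of the box's top edge relative to the horizontal, in degrees.

    Assumption: box[0] is the top-left point and box[1] the top-right point.
    """
    (x0, y0), (x1, y1) = box[0], box[1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0))


# Mock boxes (hypothetical values): one horizontal, one with a sloping top edge.
horizontal = [[0, 0], [100, 0], [100, 20], [0, 20]]
tilted = [[0, 0], [100, 100], [86, 114], [-14, 14]]
angles = [box_angle_degrees(b) for b in (horizontal, tilted)]
```

If most boxes on a page share a similar non-zero angle, rotating the image by the opposite angle before running OCR is one option worth testing.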
@user-el9fx6gz2s 1 year ago
Thanks a lot. But when I try to run this code on Google Colab it gives me an error. Can I ask you to help me?
---> 10 im_show = draw_ocr(image, boxes, txts, scores, font_path='BNazanin.ttf')
11 im_show = Image.fromarray(im_show)
12 im_show.save('result.jpg')
/usr/local/lib/python3.10/dist-packages/paddleocr/tools/infer/utility.py in draw_ocr(image, boxes, txts, scores, drop_score, font_path)
380 box_num = len(boxes)
381 for i in range(box_num):
--> 382 if scores is not None and (scores[i] < drop_score or
383 math.isnan(scores[i])):
384 continue
TypeError: '
@samantsagar3845 8 months ago
Did you find a solution for this error?
@shadabdulsamad9205 6 months ago
Which OCR engine is the best for handwriting recognition?
@RitheshSreenivasan 6 months ago
I find PaddleOCR to be good.
@shadabdulsamad9205 6 months ago
@RitheshSreenivasan Can PaddleOCR recognize handwriting on a whiteboard?
We are building an OCR system to convert handwriting on a whiteboard into machine text. TrOCR is good, but it can't take large images, so what would you recommend?
@RitheshSreenivasan 6 months ago
Try and see
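One approach sometimes used when a recognizer cannot take large images is to run a text detector first and pass only the cropped text regions to the recognizer. The sketch below covers just the cropping arithmetic and is not something suggested in this thread: the helper name `crop_rects` and the padding value are hypothetical, and the boxes are assumed to be lists of [x, y] points from whichever detector is used.

```python
def crop_rects(boxes, image_width, image_height, pad=4):
    """Turn detected boxes into padded (left, top, right, bottom) rectangles.

    Each rectangle is clamped to the image so it can be handed to a cropping
    call such as PIL's Image.crop, which takes a (left, upper, right, lower) tuple.
    """
    rects = []
    for box in boxes:
        xs = [p[0] for p in box]
        ys = [p[1] for p in box]
        left = max(0, int(min(xs)) - pad)
        top = max(0, int(min(ys)) - pad)
        right = min(image_width, int(max(xs)) + pad)
        bottom = min(image_height, int(max(ys)) + pad)
        rects.append((left, top, right, bottom))
    return rects


# Mock detector output for a 100 x 100 image (hypothetical values).
detected = [
    [[2, 3], [50, 3], [50, 20], [2, 20]],
    [[90, 90], [99.5, 90], [99.5, 99], [90, 99]],
]
rects = crop_rects(detected, image_width=100, image_height=100)
```

Each rectangle is small enough to be recognized on its own, and the recognized strings can be put back in reading order using the rectangle positions.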
@harshavardhanachyuta2055 1 year ago
The Google Colab session is crashing. Can I know the reason for this? Do I need to upgrade to Colab Pro to use this?
@RitheshSreenivasan 1 year ago
No need. There could be some other issue. Is it a GPU instance?
@harshavardhanachyuta2055 1 year ago
@@RitheshSreenivasan yes
@RitheshSreenivasan 1 year ago
Just check for other issues
@shubhmehta4035 1 year ago
TypeError Traceback (most recent call last)
Cell In[43], line 15
13 image = Image.open(img_path).convert('RGB')
14 boxes = [line[0] for line in result]
---> 15 txts = [line[1][0] for line in result]
16 scores = [line[1][1] for line in result]
17 im_show = draw_ocr(image, boxes, txts, scores)
Cell In[43], line 15, in (.0)
13 image = Image.open(img_path).convert('RGB')
14 boxes = [line[0] for line in result]
---> 15 txts = [line[1][0] for line in result]
16 scores = [line[1][1] for line in result]
17 im_show = draw_ocr(image, boxes, txts, scores)
TypeError: 'float' object is not subscriptable
Hello, I am not able to resolve this error and have no clue why this keeps on happening. I am working on Jupyter Notebook. Can you please help?
@RitheshSreenivasan 1 year ago
Book a session on my Topmate link or refer to the PaddleOCR documentation
