OCR Using Microsoft's Florence-2 Vision Model on Free Google Colab

TheAILearner

3,067 views

Published on 25 Jun 2024
In this video, I demonstrate how to run Microsoft's recently released Florence-2, a novel foundational vision model, on a free Google Colab workspace using a T4 GPU. I use Optical Character Recognition (OCR) as the primary use case to showcase the model's capabilities.
You'll learn:
1. An introduction to the Florence-2 Vision Model
2. Loading and configuring Florence-2
3. Implementing an OCR task with this advanced model
4. Evaluating the performance and results of OCR using the Florence-2 Vision Model
Code Link - colab.research.google.com/dri...
Florence-2 Model - huggingface.co/microsoft/Flor...
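The loading and OCR steps listed above can be sketched roughly as follows. This is a minimal sketch that follows the usual Hugging Face `transformers` pattern for Florence-2; it is not the code from the linked Colab notebook. The helper names `validate_task` and `run_florence_ocr` are hypothetical, and `model_id` is left as a parameter because the model link above is truncated.

```python
# The two OCR task prompts discussed in the video and comments.
OCR_TASKS = ("<OCR>", "<OCR_WITH_REGION>")


def validate_task(task):
    """Return the task prompt unchanged if it is one of the two OCR prompts."""
    if task not in OCR_TASKS:
        raise ValueError(f"unsupported task prompt: {task!r}")
    return task


def run_florence_ocr(model_id, image, task="<OCR>", device="cuda"):
    """Load a Florence-2 checkpoint and run one OCR task on a PIL image.

    model_id is supplied by the caller (assumption: a Florence-2 checkpoint
    on the Hugging Face Hub). Returns the post-processed result dictionary.
    """
    # Heavy imports stay local so validate_task works without them installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoProcessor

    task = validate_task(task)
    # Half precision on the GPU, full precision otherwise.
    dtype = torch.float16 if device == "cuda" else torch.float32

    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=dtype, trust_remote_code=True
    ).to(device).eval()
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    # The task prompt is passed as the text input alongside the image.
    inputs = processor(text=task, images=image, return_tensors="pt").to(device, dtype)
    with torch.no_grad():
        generated_ids = model.generate(
            input_ids=inputs["input_ids"],
            pixel_values=inputs["pixel_values"],
            max_new_tokens=1024,
            do_sample=False,
            num_beams=3,
        )
    text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    # Converts the raw token string into a dictionary keyed by the task prompt.
    return processor.post_process_generation(
        text, task=task, image_size=(image.width, image.height)
    )
```

Calling `run_florence_ocr(model_id, image)` with a PIL image should return a dictionary keyed by the task prompt. Generation settings such as `num_beams=3` are illustrative defaults, not values confirmed by the video.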
#florence2 #vision #multimodal #multimodalai #llm #microsoftai #googlecolab #ocr #machinelearning #ai #tutorial #freeresources #attention #objectdetection #segmentation
Science & Technology

Comments • 15

@vishalranjan2429 4 days ago ⁺¹
I want to integrate this in an Android app. How do I do it?
@jinanlionbridge4521 17 days ago
Thanks for sharing! Very useful.
@Steven_249 12 days ago
Wow... you are super smart, especially when you change the code for OCR REGION! Amazing!
@theailearner1857 12 days ago
Glad it helped!
@kushaldulani 21 hours ago
Yes, really. No one does that on YouTube; everyone else teaches only the basics. Thanks bro
@despo13 23 days ago
Thanks
@sudabadri7051 14 days ago
Good video
@seanthibert5961 14 days ago
Any luck with making use of the raw OCR results? I find it picks up more than the ocr_with_region
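One way to check this observation is to compare the two outputs word by word. The sketch below assumes the post-processed result shapes commonly returned for Florence-2: a plain string under `"<OCR>"`, and a dictionary with `"quad_boxes"` and `"labels"` under `"<OCR_WITH_REGION>"`. The helper names and the sample data are made up for illustration.

```python
def region_labels(region_result):
    """Return the text labels from an <OCR_WITH_REGION> result."""
    regions = region_result["<OCR_WITH_REGION>"]
    # Assumption: a stray "</s>" token can appear in a label, so strip it.
    return [label.replace("</s>", "").strip() for label in regions["labels"]]


def words_missing_from_regions(raw_result, region_result):
    """List words found by raw <OCR> that appear in no region label."""
    region_words = set(" ".join(region_labels(region_result)).split())
    return [w for w in raw_result["<OCR>"].split() if w not in region_words]


# Made-up results for illustration only, not real model output.
raw = {"<OCR>": "TOTAL 12.50 THANK YOU"}
regions = {
    "<OCR_WITH_REGION>": {
        "quad_boxes": [
            [10, 10, 90, 10, 90, 30, 10, 30],
            [100, 10, 160, 10, 160, 30, 100, 30],
        ],
        "labels": ["</s>TOTAL", "12.50"],
    }
}

print(words_missing_from_regions(raw, regions))  # ['THANK', 'YOU']
```

The comparison is exact and case-sensitive, so it will also flag words that both tasks found but transcribed slightly differently.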
@trinityblood5622 14 days ago ⁺¹
Any luck fine-tuning the OCR part with a custom dataset in a language other than English?
@theailearner1857 13 days ago
Haven't tried yet, but I will try to make a video on fine-tuning.
@ai_enthusiastic_ 22 days ago ⁺¹
How much RAM does it need to run on a CPU?
@theailearner1857 22 days ago ⁺¹
In full precision, it would need approximately 10-11 GB of RAM for inference. If you are not able to run it on CPU, you can try a quantized model.
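As a rough guide to why quantization helps, the memory taken by the weights alone scales with the number of bytes per parameter. The sketch below is a back-of-the-envelope estimate only: `weight_memory_gb` is a hypothetical helper, the parameter count is supplied by the reader, and total RAM use during inference is higher than the weight figure because activations, generation state, and framework overhead are not counted.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(num_params, dtype="float32"):
    """Estimate memory for the weights only, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Illustrative round number, not the size of any specific checkpoint.
example_params = 1_000_000_000
for dtype in BYTES_PER_PARAM:
    print(dtype, weight_memory_gb(example_params, dtype))
```

Halving the bytes per parameter halves the weight footprint, which is the main saving a quantized model offers on a memory-limited CPU machine.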
@NimeshV-nf6uz 23 days ago ⁺¹
Can I run this on CPU?
@theailearner1857 23 days ago ⁺²
Yes you can. Change the "device_map" argument to "cpu". And also make sure to not move input tensors to "cuda".
@NimeshV-nf6uz 22 days ago
@theailearner1857 thanks 🤜🤛
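The CPU advice in this thread could look something like the sketch below. It assumes the notebook loads the model through `transformers` with a `device_map` argument, as the reply suggests; the helper names are hypothetical, `model_id` is supplied by the reader, and using `device_map` typically requires the `accelerate` package.

```python
def choose_device(cuda_available):
    """Pick (device, dtype name): half precision on GPU, full precision on CPU."""
    return ("cuda", "float16") if cuda_available else ("cpu", "float32")


def load_on_device(model_id, device, dtype_name):
    """Load model and processor with device_map set to the chosen device."""
    # Heavy imports stay local so choose_device works without them installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoProcessor

    dtype = getattr(torch, dtype_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, device_map=device, torch_dtype=dtype, trust_remote_code=True
    )
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
    return model, processor, dtype


def prepare_inputs(processor, image, task, device, dtype):
    """Build model inputs and move them to the same device as the model."""
    inputs = processor(text=task, images=image, return_tensors="pt")
    # Uses the chosen device instead of a hard-coded .to("cuda").
    return inputs.to(device, dtype)
```

Passing `torch.cuda.is_available()` to `choose_device` selects the GPU when one is present and falls back to the CPU otherwise, so the same notebook runs in both environments. Full precision is chosen on the CPU here as a conservative default, since half-precision support on CPU varies.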
