This is awesome. For the next video, I have a suggestion: Suppose I have multiple PDF files containing a lot of information about my organization. How can I use a large language model (LLM) like the one you used above to create a dataset extracted from the knowledge provided in these PDFs?
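Something like the sketch below is what I'm imagining (my own rough attempt, assuming pypdf for extraction and an OpenAI-compatible client; the model name and file path are placeholders):

```python
# Rough sketch: extract text from a PDF, then ask an LLM to turn each chunk
# into Q&A pairs. pypdf, the client, model name, and file path are assumptions.
from pypdf import PdfReader
from openai import OpenAI

client = OpenAI()  # point base_url at your own endpoint if needed

def pdf_to_chunks(path, chunk_size=2000):
    text = " ".join(page.extract_text() or "" for page in PdfReader(path).pages)
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

dataset = []
for chunk in pdf_to_chunks("org_handbook.pdf"):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # any instruct model should work here
        messages=[{
            "role": "user",
            "content": "Write 3 question-answer pairs, one per line as "
                       f"'Q: ... A: ...', based only on this text:\n{chunk}",
        }],
    )
    dataset.append(resp.choices[0].message.content)
```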
First time learning that LLMs can generate datasets! Thanks a lot.
I also generate synthetic datasets... a secret tip for alignment: set the mood and tone as parameters in the prompt as well when generating the questions and responses (it makes the dataset a little more dynamic).
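To make the tip concrete, here is a minimal sketch of what I mean (the client, model name, and mood/tone lists are just my placeholders, assuming an OpenAI-compatible endpoint):

```python
# Minimal sketch of the tip above: vary mood and tone per sample so the
# generated questions/responses aren't all in one register.
import random
from openai import OpenAI

client = OpenAI()
MOODS = ["curious", "frustrated", "skeptical", "enthusiastic"]
TONES = ["formal", "casual", "terse", "playful"]

def generate_pair(topic):
    mood, tone = random.choice(MOODS), random.choice(TONES)
    prompt = (f"Write one user question about {topic} from a {mood} user, "
              f"then an assistant answer in a {tone} tone, "
              "formatted as 'Q: ...' and 'A: ...' on separate lines.")
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```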
Hi @ByteBop, I would like to create a synthetic dataset for images. Do you know how to do it?
Hey bro, can you tell me whether it costs anything?
Also, can you share your code? I'm a newbie and want to learn.
Really great tutorial. Keep 'em coming. First time seeing a NIMs demo.
Thank you for this awesome tutorial. I request you to kindly make a video on synthetic dataset generation from PDF files.
Thank you so much
Super cool! Just did something extremely similar but in Google Sheets, so my non-tech peers can help.
This is amazing! Can you also explain how to create a classification model from the generated dataset?
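For example, would a simple baseline like this work? (A sketch using scikit-learn; the CSV file and the 'text'/'label' column names are hypothetical, not from the video.)

```python
# Baseline sketch: TF-IDF + logistic regression over a generated dataset.
# The 'text'/'label' column names and the CSV file are hypothetical.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

df = pd.read_csv("generated_dataset.csv")
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42)

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
print("accuracy:", clf.score(X_test, y_test))
```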
If I have a specific number of subtopics and I don't want to generate new subtopics, how would I choose the number of datasets to generate?
Great insight!
Wow! Great tutorial, Mervin, this is exactly what I'm looking for for fine-tuning. I tested it and it worked perfectly. I have a question: from my understanding of your blog and video, this dataset is suitable for ORPO fine-tuning (AI feedback scores)? Can I still use it for SFT by filtering for the responses (rows) with the best scores?
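To be concrete, the filtering I have in mind is something like this sketch with the datasets library (the 'prompt'/'response'/'score' column names are my guess at the schema, not from your blog):

```python
# Sketch: keep only the highest-scoring response per prompt so the ORPO-style
# dataset can double as SFT data. Column names are assumptions about the schema.
from datasets import load_dataset

ds = load_dataset("json", data_files="generated.jsonl", split="train")

best = {}
for row in ds:
    key = row["prompt"]
    if key not in best or row["score"] > best[key]["score"]:
        best[key] = row

sft_rows = [{"prompt": r["prompt"], "response": r["response"]}
            for r in best.values()]
```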
Can we ask LLMs to outline, create questions about, reply to, and summarize BOOKS, and use that to fine-tune LLMs?
Hey, I am only able to generate 10 examples. How do I make sure it generates at least a thousand?
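For anyone else hitting this, the workaround I'm trying is to loop and batch the requests instead of asking for everything in one call (a rough sketch; generate_examples, the model, and the prompt are my placeholders, not the exact code from the video):

```python
# Rough sketch: batch the generation calls in a loop until the target count
# is reached. The prompt and model here are placeholders.
import json
from openai import OpenAI

client = OpenAI()

def generate_examples(n):
    # Hypothetical helper: one call that returns n Q&A dicts.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": f"Generate {n} question-answer pairs about the "
                              "topic, one JSON object per line with keys "
                              "'question' and 'answer'. Output JSON lines only."}],
    )
    lines = resp.choices[0].message.content.splitlines()
    return [json.loads(l) for l in lines if l.strip().startswith("{")]

TARGET, BATCH = 1000, 10
rows = []
while len(rows) < TARGET:
    rows.extend(generate_examples(BATCH))

with open("dataset.jsonl", "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")
```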
Hey buddy, can't we create a synthetic dataset for images? I mean uploading images and getting responses to questions about them... how do we do it?
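Something like this sketch is what I'm picturing: send each image to a vision-capable chat model and ask for Q&A pairs about it (assuming an OpenAI-compatible vision endpoint; the model name and file path are placeholders):

```python
# Sketch: generate Q&A about a local image via a vision-capable chat model.
# Model name and file path are assumptions.
import base64
from openai import OpenAI

client = OpenAI()

with open("sample.jpg", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Write 3 question-answer pairs about this image."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```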
I'm still fuzzy about when creating synthetic data is useful in a practical scenario for me, as a single person, not a large company that needs to fine-tune LLMs. Can someone clarify? What's the real-world use for this?
LLMs can hallucinate on the questions you ask, especially for low-resource languages. For my own language, Bengali, they hallucinate a lot and give wrong answers about facts and events. Now, you can use RAG to curb hallucination, but RAG depends on the size of the context. If I want to build a specialized model that knows and answers facts about Bengali culture and recent events, it's useful if I can instead fine-tune with a dataset of those facts and recent events, so that the knowledge becomes part of the model itself and therefore it hallucinates less.

You can think of RAG as an open-book exam, where you can search the book for answers while taking the exam, whereas a fine-tuned model is you having the knowledge in your brain: depending on your memory and reasoning ability, you will give an accurate or a hallucinated answer. But if the knowledge becomes part of your memory accurately and you can retrieve it on demand, you no longer have to search your books every time someone asks you a question. So I hope you now understand why it may be useful to fine-tune. The ultimate goal is "no hallucination" and therefore better accuracy.
@@brishtiteveja Thanks for the response, I really appreciate it. I believe you're talking about real-world data, not data generated by the AI. My question was focused on the usefulness of synthetic data, though. Or am I interpreting synthetic data the wrong way?
@@brishtiteveja That's a great answer.
Can it be used for the Indonesian language?
Bro, please create a good model for Tamil.
We don't have the best GPUs.
If you do it, we can use it for many use cases.
ollama.com/mervinpraison
@@MervinPraison 🥰 Thank you
@@MervinPraison Pretty cool. Can you point me to how I can do something like this for another language? I am trying to help build one for the Yoruba language.
Use Claude to do this. No model in this world is even close to Claude currently. Don't believe the benchmarks. The difference is huge.