Graph RAG with Ollama - Save $$$ with Local LLMs

Prompt Engineering

มุมมอง 24 079

เพิ่มลงใน
- เพลย์ลิสต์ของฉัน
- ดูภายหลัง
แชร์

แชร์

ฝัง

ขนาดวิดีโอ:

แสดงแผงควบคุมโปรแกรมเล่น

เล่นอัตโนมัติ

เล่นใหม่

เผยแพร่เมื่อ 24 ธ.ค. 2024

ความคิดเห็น •

@programwithpradhan 5 หลายเดือนก่อน ⁺¹⁸
Definitely waiting for fully customisation of graph RAG using open source models
@JohnBoen 5 หลายเดือนก่อน ⁺¹⁰
Great discussion.
I am tackling things differently, and it seems to work pretty well.
I use ChatGPT to manually construct subject-predicate-target statements from a document. Instructions to infer root names from prepositional references, etc...
I feed this into a graph database.
User-entered text is passed to a local Llama3 instance to construct a graph query - this query is executed against the graph database.
This result set is added to the initial user text and passed to an appropriate LLM.
@_shikh4r_ 5 หลายเดือนก่อน
That's interesting
@engineerprompt 5 หลายเดือนก่อน ⁺¹
that's a really interesting approach. Would love to learn more if you can share.
@JohnBoen 5 หลายเดือนก่อน ⁺³
@engineerprompt
I am using this to create story context.
Convert a 3 paragraph randomly generated bio into a bunch of subject-predicate-target statements like Zahlar - loves - Dhardor city. Dhardor city - has - slums.
Generate a random paragraph about all targets. Add more subject-predicate-target statements where the target is now the subject.
Do this for 30 people.
I am having so much fun reading story ideas that I am not making progress on the code!
@mohammadchegini57 5 หลายเดือนก่อน
@@JohnBoen Are you planning to publish anything in GitHub?
@MeinDeutschkurs 5 หลายเดือนก่อน
Yeah! Prompting is key. Llama3 is very good at step by step instructions. Print this, write that, do this and combine 12:08 that, finally this. The good thing is that larger models are also able to understand this, but most of the time it does not work in the other direction.
@maxs6128 5 หลายเดือนก่อน ⁺⁷
ollama embeddings proxy on git hub. This script bridges the gap between OpenAI's embedding API and Ollama, making it compatible with the current version of Graphrag
@jasmeetsingh6121 5 หลายเดือนก่อน ⁺⁴
Hey, great video, A couple questions
1. Can I create the Graph using Llama3-70B and then use a different LLM (which doesn't have a rate limit) to answer RAG queries ?
2. Can I create a partial Graph, and update it as more data comes in (rather than create the Graph all over again) ?
@GoranSkular 3 หลายเดือนก่อน
1. yes you can.. you are just creating a graph with help of LLM.. you can make it manually if needed, without LLM
@maxs6128 5 หลายเดือนก่อน ⁺¹³
Hey! Cool video. I actually built a full local solution using Ollama, no need for LM Studio at all. Here's what I did: I created a proxy that translates between OpenAI API embeddings and Ollama's format, both ways.
The cool thing is, it works flawlessly for both global and local queries. I'd be happy to share the script with you if you're interested!
@athomas7347 5 หลายเดือนก่อน ⁺³
Would love to see this!
@maxs6128 5 หลายเดือนก่อน
ollama embeddings proxy on git hub
@engineerprompt 5 หลายเดือนก่อน ⁺⁴
Awesome. Would love to have a look at the code. Please do share.
@tofani-pintudo 5 หลายเดือนก่อน ⁺¹
If you're comfortable, why not open source it man!
@whlau6191 5 หลายเดือนก่อน
Thanks, could you share the GitHub link? Thank you very much for the effort!
@BrandonFoltz 5 หลายเดือนก่อน ⁺²
Yea I tried using llama-3 in LM Studio and using OpenAI embeddings since its cheap. GraphRAG detonated (after waiting an hour of course). It seems like it did all the LLM stuff OK and embeddings OK, but at the end when trying to put everything together it just went to crap. Too specific to using OpenAI for everything. Even using 4o it was more expensive than a six pack of beer and I ain't giving that up.
@engineerprompt 5 หลายเดือนก่อน ⁺¹
I agree. Hopefully the community will be able to modify the code to add real support for open weight models.
@jamlesscookie2466 4 หลายเดือนก่อน ⁺¹
On my PC, the RAG pipeline deployment fails with the following error:
❌ Errors occurred during the pipeline run, see logs for more details.
Logs:
....
....
File "C:\Python311\Lib\site-packages\pandas\core\indexers\utils.py", line 390, in check_key_length
raise ValueError("Columns must be same length as key")
ValueError: Columns must be same length as key
Any idea why this happens?
@shameekm2146 5 หลายเดือนก่อน ⁺¹
For embedding model as well can't we use ollama? I see that mixedbread mxbai is available.
@engineerprompt 5 หลายเดือนก่อน ⁺¹
I haven't had much success with it yet.
@MrAhsan99 5 หลายเดือนก่อน
Does this RAG provides better results compared to semantic chunking?
@COC-ys5ir 2 หลายเดือนก่อน
Sir I am getting error like number of columns must be same as keys
pls help
@awakenwithoutcoffee 5 หลายเดือนก่อน
what about Lamini fine-tuning ? this might just be the best of both worlds. Would be really interesting to see comparisons between traditional RAG (optimized techniques), graphRAG and fine-tuning (lamini).
@meelanc1203 5 หลายเดือนก่อน
Thanks for the video, but are you, by chance, sharing the code modifications? I did not see any links.
@engineerprompt 5 หลายเดือนก่อน
The code is available when you create the graph rag repo. Watch the previous video for how to do that.
@user-wr4yl7tx3w 5 หลายเดือนก่อน
great content and analysis
@engineerprompt 5 หลายเดือนก่อน
thanks
@nguyenanhnguyen7658 5 หลายเดือนก่อน
May work with English-only and not even work with vLLM due to buggy parameters !
@xinzhang3502 5 หลายเดือนก่อน
So compared to GPTs, his search generation effect will be better?
@engineerprompt 5 หลายเดือนก่อน
that will depend on your hardware.
@brucewayne2480 5 หลายเดือนก่อน
Have you tried to use it with an existing neo4j graph ?
@engineerprompt 5 หลายเดือนก่อน
Not this specific project.
@Sceptic850 5 หลายเดือนก่อน
Could you use gemini Flash or Claude Haiku or DeepSpeed V2 to keep costs down??
@engineerprompt 5 หลายเดือนก่อน
That is possible as long as they use the same standard as openai api. For embeddings, its a very different story.
@takshitmathur2761 5 หลายเดือนก่อน
if my data size is larger(say 1000 pdfs) then the embedding cost will be too high, even if we use local models the time taken is too high right now. what do you think about using gemini pro model for doing that, as google charges no cost up to 300$ for AI projects. Maybe you can suggest your views and make a video on this in future?
@_shikh4r_ 5 หลายเดือนก่อน
Not sure about how accurate it would be. You will definitely have to craft your own prompts and experimenting with different prompts till you get a decent graph rag as a result would also cost you some money.
It would take cycles, how much time do you have?
@mrchongnoi 5 หลายเดือนก่อน ⁺²
Definitely have to use this with local models. I burned $46.00 in tokens. I waited 10 minutes before I could test the application. Your text files have to be 8-bit clean, or the app will blow up. Graph Rag is not a gift from microsoft.
@engineerprompt 5 หลายเดือนก่อน
I agree, its very expensive. Great a proof of concept but not yet ready for real use-cases. Local models is the way to go, if you can make it work properly.
@WeirdoPlays 5 หลายเดือนก่อน ⁺⁵
CrewAI vs AutoGen vs Langgraph
@engineerprompt 5 หลายเดือนก่อน ⁺⁴
or create your own agents :) working on a series on how to do that.
@awakenwithoutcoffee 5 หลายเดือนก่อน ⁺¹
LangGraph since it abstracts the minimum and is made to fill the gaps of LangChain.
@TranKiet-pj9mw 5 หลายเดือนก่อน
actually , when going with entity extraction , people can extract Noun and adj as a text fiile then make a cluster , using it in struct to send a request to bigger model such as chat GPT , the struct is well defined , accoding 5 topic of science so it always return a value you want to looking and decreasing the complexity . my structure is simple , define , formulate , it s shortcut of which ? ( looking for enpoint ) , middle point ( by tracking respond then sending again) ,.. the structure will depend on user . Let define "Noun" , is that Noun using for approximating a thing as a word ? . so when we now Noun and define for that word, we know the relevant topic around it , and because synonym is limited , so just respond over time , it would get the converge information , and luckily , we could use a little trick to get 1000 account for free . but my method is still limit when i dont define math formulate , and other language except english , some define still not unclear enough and fragment information only sense when take big sample .
your method is great too , why have to stick in one idea , the more would be better . thanks
@readmarketings9061 5 หลายเดือนก่อน ⁺¹
If Graphrag can’t match the quality of Diffbot while being cheaper, it’s currently not useful.
@engineerprompt 5 หลายเดือนก่อน ⁺¹
didn't know about it, will check it out.
@SethCohn23 5 หลายเดือนก่อน ⁺¹
chromadb FTW
@engineerprompt 5 หลายเดือนก่อน
:)
@lesptitsoiseaux 5 หลายเดือนก่อน
Doesn't work on PC. I'll try on Mac. I haven't found a video that actually makes it work on PC locally.
@Sri_Harsha_Electronics_Guthik 5 หลายเดือนก่อน
Much appreciated!

ต่อไป

เล่นอัตโนมัติ

Multi-modal RAG: Chat with Docs containing Images