Excellent video! Thanks Leann, I had no idea about Diffbot, i'll be checking that out for sure.
Best of luck on your GenAI Journey
I've just completed some experiments using Microsoft's GraphRAG and your description describes my results exactly, "garbage-in-garbage-out". Without a consistently solid knowledge graph there's not much an LLM can have knowledge of. Thanks for sharing your project. I'll take a look at that.
It's nice to see my old co-worker Michelle randomly popping up in a video. I hope you were able to meet her. She is great!
I did meet her and was able to talk to her personally! It was definitely great. Joining the session she presented with Amy Hodler will make you realize you don't want to miss another one! 😊
KGs are key for providing context to RAG. Still, I see the OWL/RDFS path outperforming LPG as it enables the user to explicitly define semantics and infer knowledge.
It can be done by hand, but automating this human skill is impressive. Good video!
This is very insightful, Leann. Cheers from South Africa!
Thanks for the encouragement! 😊
I feel that for most RAG use cases, a vector DB is good enough to retrieve information for the LLM. But I agree that a KG is better when you need the LLM to answer complex questions with precise and explainable answers.
Use a VectorDB when 'good enough' is acceptable. Give your VectorDB a brain by combining it with a KG if you need it to be accurate, timely, or safe
Yes, to physically present yourself in multiple locations at the same time is quite challenging. My understanding is that it requires you to achieve presence on the fourth dimension. Once there, you can then enter multiple three-dimensional spaces at the same time. I wish I could do that, though I suspect it would be really disorienting at first! Best wishes!
Also, I learn something new with every one of your videos! Thank you! I really like your approach!
I'm surprised and also thrilled that finally someone takes my not-so-funny joke in the video seriously! 😂 Love your concise and scientific explanation of the multi-dimensional space, which makes me dream more about having that superpower. 😉 Thanks for the encouragement once again. I'm learning a lot from you guys too and have been enjoying the journey with you all!
@@lckgllm Oh good. When you make it to the fourth dimension, please give a holler! :) It would be fun to see you in two places at the same time. Three even! XD.
In the meantime, please keep us posted as to your coding progress. I find your videos really helpful. Thanks!
Excellent video, clear explanation. Please do post more in the GenAI and knowledge graph space!
Thank you for the amazing short video. I am eagerly waiting for you to make a video on how to convert CSV data into knowledge graphs and answer questions on the CSV files.
Amazing video - hope to see more. This was very informative and inspiring to learn about knowledge graphs.
Great! Waiting for more of your videos!
Hello, thank you very much for posting the video. I am very interested in the part where you also show the graph within the chatbot. What Python package is that, please? (My apologies if the question is redundant, I couldn't find it in other comments.)
Very cool! I have been working on building a client-side profiling & Hyperthymesia second-brain graph RAG kind of thing and really struggled with the bill for GPT-based graph construction! Thanks!
Sounds like a cool project! Let's chat 🙂
Is there any technique to evaluate the knowledge graph quality? There were some incorrect entities.
Thanks Leann. I'm going to have to give it a try for a deeper dive with Diffbot.
Can you please share the code for the application you built to visualize the knowledge graph?
This was an interesting video. I was more focused on the process, and thinking behind using this process to organize and visualize data.
Interesting post, Leann. Keep it up! It would be interesting to explore from a procedural perspective how graphs could supplement vector databases in RAG doc retrieval and relevancy evaluation.
Thank you for your feedback! That gives me an awesome video idea. See you in the next one 😊
You betcha. Subscribed!@@lckgllm
How do you return a Neo4j subgraph image in Streamlit's response?
Hey Leann, first of all great explanation with some insights (especially on Diffbot). You got a new subscriber 👍
I'm going to work on a RAG-based project which will use Neo4j as a graph database.
I've gone through other comments and your answers to them, but still wanted to know a few things:
1. Here you took the example of speakers and what they have spoken about (and their interests/expertise etc.), which is working fine. But what if I have some PDF docs of roughly 50-70 pages with rules and regulations and want to use them as a custom knowledge base for my RAG project? Is a knowledge graph database a good choice? Why not a simple vector DB (such as Milvus)?
2. Assuming I must use a graph database, how do I efficiently chunk the PDFs and store them as graph nodes and relationships, so that when users ask a query they get the correct answer?
3. If the docs are related to rules and regulations, then what would the nodes and relationships between them be? Because here in your example, the nodes were speakers, their expertise, etc.
I understand that you might not have a perfect answer for all of the above, but I'd like to get your point of view. Hope you find my comment and reply once you get time. Thanks for reading and for your time. 😊
This is exactly something I'm looking for as well. Were you able to figure out the answers and how it all came together? Or are you still in the implementation process? I would truly, truly appreciate a response from you. Thank you.
Are you creating embeddings on top of the knowledge graph for RAG??
Have you tried using a local LLM such as Mistral? It'll take a bit longer, but it's considerably cheaper.
I think I didn't get my point across clearly, and sorry about that. In terms of constructing a knowledge graph, from my experience, Diffbot's Natural Language API currently has the best performance for Named Entity Recognition (NER) and Relationship Extraction (RE) compared to GPT-4 (so far the best LLM) or spacy-llm, as I have tried them both. Frankly speaking, large language models are not inherently optimized for tasks like entity/relationship extraction, and we should think again about whether LLMs are the best option for every single task.
@@lckgllm Everything's a nail if you only wield an LLM =D
🤣@@SlykeThePhoxenix
Hi, since you mentioned pricing, especially how expensive GPT-4 is, is using the Diffbot API free? Thanks.
Diffbot sets a pretty high bar for entry to this project; any thought/plan to utilise an open-source project instead? Thanks!
Yes, I have previously used spacy-llm in my last video: th-cam.com/video/mVNMrgexxoM/w-d-xo.html
However, from the results generated by spacy-llm in my GitHub, you can see that there are still errors in the output, and I need to further pass the results to ChatGPT-4 for refining: github.com/leannchen86/openai-knowledge-graph-streamlit-app/blob/main/openaiKG.ipynb
I hope future LLMs (regardless of closed source and open source) will enable us to see the confidence score for the output as I experienced with Diffbot's APIs.
thank you@@lckgllm ! I will have a look @ the video and the notebook. Might come back for discussion again. have a good one!
Great video! However, I would completely replace DiffBot with an open source solution. There are many NER models, SpanMarkerNER to name one, since most of the entities you showed in the video are Person, Location, and Org, which libraries like SpaCy and setFit are pretty good for them. Using LLM with few shot learning would be another option. Overall, very nice video.
Thanks for the feedback! I have previously used spacy-llm in my last video: th-cam.com/video/mVNMrgexxoM/w-d-xo.html
However, from the results generated by spacy-llm in my GitHub, you can see that there are still errors in the output even if examples are included in the prompts, and I needed to further pass the results on to ChatGPT-4 for refinement: github.com/leannchen86/openai-knowledge-graph-streamlit-app/blob/main/openaiKG.ipynb
I hope future LLMs (regardless of closed source and open source) will enable us to see the confidence score for the output as I experienced with Diffbot's APIs.
@@lckgllm If you'd like to have confidence scores using LLMs, a simple hack is to ask for them in the prompt, so the LLM returns the results with scores. :)
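In case a concrete example helps, here is a minimal sketch of that prompt hack, assuming the OpenAI Python client; the model name, sample text, and JSON format are illustrative choices, not something from the video. Keep in mind that self-reported scores are not calibrated the way Diffbot's are.

```python
# Sketch only: ask the LLM to attach a confidence score to each extracted triple.
# Assumes OPENAI_API_KEY is set in the environment; model choice is arbitrary.
from openai import OpenAI

client = OpenAI()

text = "Amy has a love for science history and a fascination for complexity studies."

prompt = f"""Extract (subject, relation, object) triples from the text below.
Return JSON: a list of objects with keys "subject", "relation", "object",
and "confidence" (a number between 0 and 1 reflecting how certain you are).

Text: {text}"""

response = client.chat.completions.create(
    model="gpt-4o-mini",  # any chat model works; this pick is an assumption
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```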
Can you make a video on how to generate knowledge graphs for PDF books like the DSM-5?
I guess you will continue posting KG content with Diffbot only in the future?
I'm interested in creating a Little Logical Model based upon the command structure of an application, and then using agents to take voice to text and text to commands, maybe with a corresponding graph view, updated with currently available information, in another window on another display screen.
I don't understand how knowledge graphs are being used in RAG? What's the differences between a KG-RAG and a normal RAG?
Good question @shingyanyuen3420! Sorry for not making it clear in the video; I will improve my explanation next time.
Typical RAG applications chunk documents into smaller parts and convert them into embeddings, which are lists of numeric values. The LLM then retrieves information based on semantic similarity to the question.
However, the information retrieval process can become challenging as document sizes increase, potentially causing the model to lose the overall context. This is where knowledge graphs can be useful. Knowledge graphs explicitly define the relationships between entities, offering a more straightforward path for the LLM to find the answer while staying context-aware - improving accuracy for the retrieval process.
Hopefully this article is helpful:
ai.plainenglish.io/knowledge-graphs-achieve-superior-reasoning-versus-vector-search-for-retrieval-augmentation-ec0b37b12c49
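For readers who want to see the contrast in code, here is a minimal sketch assuming a local Neo4j instance and some existing chunk-embedding index; the connection details, labels, and relationship names are assumptions for illustration, not the exact setup from the video.

```python
# Sketch only: contrasts similarity-based retrieval with an explicit graph lookup.
from neo4j import GraphDatabase

# 1) Typical RAG: chunks are embedded and ranked by similarity to the question
#    (pseudocode - depends on whichever vector store you use):
#    top_chunks = vector_index.similarity_search("What is Amy interested in?", k=4)

# 2) Graph RAG: the relationship is stored explicitly, so retrieval is a traversal.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
cypher = """
MATCH (p:Person {name: $name})-[r]->(t)
RETURN p.name AS person, type(r) AS relation, t.name AS target
"""
with driver.session() as session:
    for record in session.run(cypher, name="Amy"):
        print(record["person"], record["relation"], record["target"])
driver.close()
```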
Can RAGs become efficient enough to do data analysis over text tables and csvs? I'm planning to build one so wanted to know if this is possible.
Yeah I think so! That's a great idea for a new video :)
@@lckgllm Yes. I would be glad to collaborate on such a project.
Very insightful, would love to see more such videos.
I'm a newbie on the matters discussed here, but I really do appreciate the way your MyGraph RAG AI Assistant works, responding with text AND a graph. Can you tell me a bit more about how you accomplished this? (I'm especially interested in the graph that got generated!) Hope that's not a stupid question.
Definitely a great question! I didn't include the process in this video and plan to make another video about this, but let me show you the details via email :)
@lm Would be highly appreciated! 🙏 Didn't get it yet though...
@@lckgllm Hi Leann, I had the same question. Isn't this just an implementation of streamlit-agraph? Is there any reason why you left this out of the GitHub repo you shared? It would be incredibly helpful/instructive to see the implementation.
Hi Leann. First off great video, I have been following your content for a while. I have some quick doubts:
1. What is your opinion on using knowledge graph RAG at production level compared to vector search?
2. I have tried different methods to extract entities and relationships from unstructured data. What I am looking for is leveraging LLM capabilities to extract implicit and explicit entities and relationships from data so as to reduce manual effort/errors. So far I have tried the following methods:
i) Using the REBEL model to extract entities - not good for large sets of data.
ii) Directly using GPT-4 - too much cost and a lot of prompting.
iii) spacy-llm from your video - OK, but when it comes to large data there are still many errors.
What do you think would be the best and most optimized approach here for a production application? We have thousands of files and I am looking for a structured method which is cost-efficient and effective in extracting entities and relationships from large unstructured data. Would love your opinion on this.
Thank You
good question
Great question! To be honest I'm not yet an expert on this subject, but I'll do my best to provide a balanced view. For your 1st question, while it's a big one to answer, I'd say it heavily depends on what your data looks like and what the expectations for your RAG system are. What's the main problem your RAG app is trying to solve?
If your app needs to be highly context-aware and able to draw/highlight the relationships between entities (e.g. find the shortest path between A & B), building a quality knowledge graph is essential for it to perform well. However, if scalability and speed are more of your focus, vector search may deserve a higher priority. And it doesn't need to be "either/or": there are also examples where knowledge graphs and vector search are combined, which of course will be a more challenging task in terms of designing the roadmap.
Speaking of production-level development, I'm curious if you already have your benchmarks and evaluations ready? Evals on LLMs are a very active research area. At least from what I've seen (I can be wrong), there are still lots of unknowns around how to define metrics and evaluate an LLM's performance. Unless you have your metrics ready to accurately measure the performance of your RAG system, I'd say it's still early to think about production-level issues.
To your 2nd question, thanks for watching my previous video on spacy-llm. 😊 It's true that results from spacy-llm are not perfect yet as the model is still powered by LLMs such as GPT-4, which so far still sucks at identifying and labeling entities/relationships. That's why I tried Diffbot and actually found that the performance is better via their Natural Language API, even though it currently comes with a price for enterprise. If the organization you work at has the budget, I think it's worth trying. *Note: this video is not sponsored by Diffbot, so I'm not trying to talk you into buying their product. I purely share my experience and the process of building this project (see the description).
I hope this information is helpful to you!
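As a small illustration of the "shortest path between A & B" kind of question mentioned above, which pure vector search struggles with but a graph database handles directly, here is a hedged sketch via the Neo4j Python driver; the URI, credentials, labels, and names are made up for the example.

```python
# Sketch only: a shortest-path query is a one-liner in Cypher, whereas embedding
# similarity alone cannot answer "how are A and B connected?"
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

cypher = """
MATCH (a:Person {name: $start}), (b:Person {name: $end}),
      p = shortestPath((a)-[*..6]-(b))
RETURN [n IN nodes(p) | n.name] AS path
"""
with driver.session() as session:
    record = session.run(cypher, start="Amy", end="Michelle").single()
    print(record["path"] if record else "No path found")
driver.close()
```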
@@lckgllm
Hi Leann. First of all thank you for the reply.
Is it OK if I reply on LinkedIn directly to you?
@lcgdsllm Hi Leann, I have messaged you on LinkedIn 😀
@@Manu-m8w6m Even I'm looking for something similar.
So far I've found `Universal-NER/UniNER-7B-all`, which is good at identifying entities although it can get only one entity at a time.
And `Tostino/Inkbot-13B-8k-0.2` seems promising, although I haven't tried it out yet.
Can you share if you have found a better way to extract entities and relationships from unstructured data?
Very comprehensive. I am following you 643
Thank you ♥
Thank you so much for this incredible tutorial! I've discovered that "GenAI" is my newfound passion, and I hadn't even heard of the term until I watched your video. I look forward to your next video.
Thank you so much for the encouragement! I’ll continue working hard to bring better content. Stay passionate with GenAI 💪🏻❤️
You are an excellent presenter. Thank you. We do however need to find you a better background music. It's giving pharmaceutical commercial and the levels are a little too high over your voice. Still great though. You have excellent stage presence and a clear voice.
Totally agree with you :) I have since upgraded to Epidemic Sound for music and am more mindful that the music volume should not distract the viewer when I'm speaking. I'm trying to learn and become better after every video, so I really appreciate seeing feedback like this for improvement!
Love this!
Thanks for the awesome video!
I was trying to reproduce your code but got an error because the "text_for_kg()" function was not defined. Any chance you can help me understand where this function comes from?
Great content and great editing!
Thank you
Same problem for me. Trying to implement text_for_kg.
Hello! Sorry for the late reply, been busy with work. I just realized that text_for_kg() somehow was deleted from the notebook, but it should be the same thing as diffbot_nlp.nlp_request(). I just updated the notebook in the GitHub repo. Let me know if it doesn't work. I'll do my best to fix it. Thanks for pointing this issue out! @souzajvp
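For anyone else hitting the same error before the notebook update, here is a rough guess at what the missing helper might look like, based only on the reply above (it is just a thin wrapper around diffbot_nlp.nlp_request()); the import path and API-key handling are assumptions about the notebook's setup, so treat the updated repo as the authoritative version.

```python
# Hypothetical reconstruction of text_for_kg(); import path and env var are assumed.
import os
from langchain_experimental.graph_transformers.diffbot import DiffbotGraphTransformer

diffbot_nlp = DiffbotGraphTransformer(diffbot_api_key=os.environ["DIFFBOT_API_KEY"])

def text_for_kg(text: str):
    """Send raw text to Diffbot's Natural Language API and return its response."""
    return diffbot_nlp.nlp_request(text)
```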
nice sharing
Thanks.
Great video!
Excellent tutorial! 👍
Very good video! Thanks!
Thanks. Can I ask a question: "What would be the first and second steps to make a medical chatbot using an LLM with my own style and persona?" Thanks.
This is a big question to answer, but what's important and fundamental is having a high-quality dataset. Do you already have your domain-specific data?
@@lckgllm thanks very much
Hi Leann, a wonderful video. I was looking for something like this. How can I reach out to you?
Thank you! You can find me through either LinkedIn: Leann Chen.
Or email:
leannchen86@gmail.com
Look forward to connecting soon!
6:26 It's not a great answer. 🙁 The graph DB has effectively acted as a bottleneck for the data. I.e. The answer is based purely on nodes + edges.
I'd be curious whether the graph DB could essentially act as an index for the original content.
I.e. Still use a graph query to return the relevant nodes/edges, but pass the source text corresponding to them as a RAG response.
Good question, although I'd defend that the answer at 6:26 is good enough for my use case 😂, as my purpose was converting unstructured text data into structured knowledge graphs, which served as the ground truth for the LLM to find the answer. 6:26 showed exactly the context from my knowledge graph.
I think what you're asking "whether the graph DB could essentially act as an index for the original content" is a different use case, where the documents themselves are classified as nodes and edges would be appended to the nodes based on their semantic similarities. I'd probably make another video particularly for this use case, which is different from what you saw in this current video.
@@lckgllm not the entire document, but rather the section of it that corresponds to the creation of that node/edge.
E.g.
*Graph response:* [Amy] interested_in [science history]
*Source text:* "Amy has a love for science history and a fascination for complexity studies"
If it was possible to store the source text as an attribute of each relationship and return that rather than the edge names then you'd probably get a higher quality answer.
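If it helps anyone trying this, here is a small sketch of the idea, assuming Neo4j and its Python driver; the connection details, labels, relationship type, and property name are placeholders, not the video's actual schema.

```python
# Sketch only: keep the originating sentence as a property on the relationship,
# then return it at query time instead of (or alongside) the edge name.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    # Store the source sentence on the edge at graph-construction time.
    session.run(
        """
        MERGE (a:Person {name: $person})
        MERGE (t:Topic {name: $topic})
        MERGE (a)-[r:INTERESTED_IN]->(t)
        SET r.source_text = $source
        """,
        person="Amy",
        topic="science history",
        source="Amy has a love for science history and a fascination for complexity studies",
    )
    # At answer time, hand the stored sentence to the LLM, not just the edge name.
    result = session.run(
        "MATCH (:Person {name: $person})-[r:INTERESTED_IN]->(t) "
        "RETURN t.name, r.source_text",
        person="Amy",
    )
    for record in result:
        print(record["t.name"], "->", record["r.source_text"])
driver.close()
```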
Ohh!! I like this idea; yes, it would be more concrete and reliable. Thanks for sharing! Let me try to improve this feature and maybe make a video about it ;) Really appreciate your feedback, thanks so much ❤ @@AndrewNeeson-vp7ti
why do you switch rooms?
Because I was traveling between different locations, and this video actually took me like 5 days to film lol
Subbed for Mr. Beast counting 💗
You just did not mention the price of Diffbot.
Yep, because it's not a sponsored video (see the description). It's a tutorial video about a graph RAG application.
go to the presentation by Amy Hodler (and tell her I said hello)
I have tested a few critical documents to get some answers using standard RAG and, to be honest, didn't enjoy the performance so much.
Thanks for sharing your experience! Are you referring to standard RAG (purely vector-based) or graph-based? Purely vector-based RAG is not great, while graph-based RAG could return more reliable results. But I also have to be honest that prototypes are generally cute and are very far from production-ready. That's why we need a lot of testing/evaluations, and I'm currently gearing towards making videos containing production-oriented testing :)
Here's a video that I did some testing: th-cam.com/video/mHREErgLmi0/w-d-xo.html
Awesome.
Subscribed and following.
Great video, I like that, but please stop using large language models; for NER there are way better-performing and cheaper alternatives :)
Hey Daniel, what are the better-performing and cheaper alternatives for NER?
Good question! I want to know too :) @danielneu9136
@@lckgllm Use small open-source encoder models for NER, e.g. fine-tuned versions of ALBERT. Simply import the right model from the Hugging Face transformers library and let it label your dataset. They often perform better than LLMs on the tasks they are trained for, and most of them even work with the Colab free tier :)
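To make that suggestion concrete, here is a minimal sketch using the transformers NER pipeline; the model name is just a well-known example of a small fine-tuned encoder, not necessarily the one the commenter has in mind, and the sample sentence is made up.

```python
# Sketch only: a small fine-tuned encoder doing NER via the transformers pipeline.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="dslim/bert-base-NER",    # example of a small fine-tuned NER model (assumption)
    aggregation_strategy="simple",  # merge word pieces into whole entity spans
)

text = "Amy Hodler presented a session on graph analytics at the Neo4j conference in London."
for entity in ner(text):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```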
Content was good but found face filters visually distracting.
What face filters? I literally talked in front of my MacBook Pro 14. I did have makeup on which I admit.
@@lckgllm You're fine, do not worry about it. However, the audio sounds like it has a low bitrate.
Definitely going to get a mic for better voice quality. Thanks for the feedback, folks! @@Armoredcody
Cool! New sub from me 😊
Simply using OpenAI tools is not interesting
Ur cute
Not down with the closed-source stuff for local LLMs, but cool info regardless.
Thanks for the feedback and you actually just gave me a great idea on future videos!
Very nice content! support support 🇹🇼