Bro, your tutorials are GOLD. You are literally the only one I've seen on the platform breaking it down like this. You fking ROCK 🤘
100%
+1
I have massive ADHD and for some reason your minimal chill communication style is easy to listen to and your visuals are super helpful. Thanks for putting these together!
The way you handle software engineering principles is absolutely amazing!
Hi! I just wanted to say a huge thank you for the incredible work you’re doing and the knowledge you’re sharing. Your videos are full of inspiring content and valuable information that really help with personal development. I appreciate your effort and dedication. Thanks a lot, and I can’t wait for more great content!
Your method is leaps and bounds better than most. I enjoy your tutorials very much.
Just wanted to say again, great content, mate. As a self-taught/teaching AI engineer/programmer/content creator, your content is an incredible resource and inspiration. Keep it coming!
Fantastic and inspiring. At the end of your video you also answered a question I had regarding smaller LLMs and hardware restrictions.
How is this approach different from using LangChain's ChatOllama instance? Doesn't that handle everything on the backend?
Awesome, I would like to see how your unique approach works when incorporating an Ollama embedding model + vector store.
This makes so much sense... I designed a small workflow (no agents involved) to parse some tabular text data and do some reasoning on each row... I used llama3 8B. It worked ok but every few rows the response would not return in the correct format. Sometimes one of the main headers would come back with a typo. The solution I found was to catch the errors and when they occurred re-run the function. Not ideal of course but did the trick as it was a small job... now I understand it may just be that these smaller models are just not reliable when you need to work with structured responses...
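For anyone who wants the "catch the error and re-run" idea from the comment above in code, here is a minimal sketch. It assumes the model is asked for a JSON object per row; `generate`, `call_with_retry`, `parse_row_response` and the header names are hypothetical and not taken from the video's code.

```python
import json


def parse_row_response(text, required_keys):
    """Parse a model reply and check that every expected header is present."""
    data = json.loads(text)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = set(required_keys) - data.keys()
    if missing:
        # A header that comes back with a typo shows up here as a missing key.
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


def call_with_retry(generate, prompt, required_keys, max_attempts=3):
    """Call `generate(prompt)` until the reply validates or attempts run out.

    `generate` is any callable that takes a prompt string and returns the
    model's raw text reply (an assumption for this sketch).
    """
    last_error = None
    for _ in range(max_attempts):
        raw = generate(prompt)
        try:
            return parse_row_response(raw, required_keys)
        except ValueError as err:
            last_error = err  # malformed reply: try again
    raise RuntimeError(f"no valid reply after {max_attempts} attempts") from last_error
```

Capping the attempts keeps a consistently failing row from looping forever, and validating the headers catches the typo case as well as outright broken JSON.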
If we are using local LLMs through something like Ollama, can we debug on the local server? Suppose I want to see the tokenization and detokenization inside the API. How can we see that? Thanks for the lecture.
I love your implementation. I’ll modify my pull request to use your Ollama implementation and resubmit for the SearXNG feature. I’ll try and follow your style to select between SearXNG and Serper.
Great content and presentation, thank you. I would really like to see a workflow that uses local models to generate components of the output and then any one of the non-local models to synthesize the final output. A neo4j knowledge graph for shared memory between agents would be an amazing next step.
Thanks for doing these! This is EXACTLY what I needed EXACTLY at the right time in my learning process. Suggestion for next tutorial: how to get two different models talking to each other and run python scripts as tools :). Love your work.
Thanks for the suggestion and thanks for watching. Glad it has been helpful.
I really like the content you produce, man! Keep it up! Cheers.
Amazing tutorial. I just tested your app using my RTX4090 and Llama3.1:8b. The results were impressive and latency was OK considering it's running locally. I also tried with Llama3.1:70b and it worked great but too slow running locally. Llama3.1 looks like a game changer for local LLM apps.
I tried to run ReAct agent tool calling using llama3:instruct with LangChain and LlamaIndex,
and LlamaIndex was able to call any local function seamlessly without any formatting issues. But LangChain failed because the agent passed the parameters as strings instead of converting them to int.
I tried it on a multiplication function and a vector DB retriever case.
The only issue with llama_index is that it doesn't have LangGraph 🤣
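One way to work around the string-versus-int problem described above, independent of either framework, is to coerce the arguments yourself before calling the tool. This is only a sketch under the assumption that tools are plain Python functions with type hints; `call_tool` and `multiply` are hypothetical names, and this is not how LangChain or LlamaIndex handle it internally.

```python
import inspect


def multiply(a: int, b: int) -> int:
    """Example tool: multiply two integers."""
    return a * b


def call_tool(func, raw_args):
    """Call `func` with arguments cast to the simple types it annotates.

    Models often emit every argument as a string, so "6" is cast to 6
    when the parameter is annotated as int.
    """
    coerced = {}
    for name, param in inspect.signature(func).parameters.items():
        if name not in raw_args:
            continue  # let Python complain later if a required argument is missing
        value = raw_args[name]
        target = param.annotation
        if target in (int, float, str) and not isinstance(value, target):
            value = target(value)  # raises ValueError if the text is not numeric
        coerced[name] = value
    return func(**coerced)


print(call_tool(multiply, {"a": "6", "b": "7"}))  # prints 42
```

Only `int`, `float` and `str` are handled here; anything more complex (lists, optional values) would need a proper schema validator.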
Hey, just curious: why not use the langchain wrappers of serper api and ollama api?
Can you make ReAct agents from scratch, without using LangChain, with JSON output (like in the case of create_structed_agent())? When I use the standard LangChain way, I get parsing errors. Please.
As usual, great tutorial. I would also love it if you could create similar tutorials on CrewAI as well, thanks.
Hey brother! Is it possible to use Groq? 👀
👀
Could you test Llama 3 Instruct on Python coding?
Hi, for me, so far, the best structured output model on Ollama has been `codestral` (22b, and, if it matters, it has a non-commercial licence). I agree, we are not there yet with those SLMs. Maybe later, Nov-Dec this year.
Can't we use it from LangChain directly?
I think you use Langgraph for more complex Agent architecture🤔
Only AI channel that actually is helpful.
Hi, thanks for your videos.
I keep testing Ollama in tools like aider, and even in my own Python apps, and it is always a struggle (not the integration and setup).
The results are so bad...
With aider, for example, it always messes up writing the files and chooses the wrong folders.
Even with tools like agent-zero, it all makes you go mad.
Then you switch to Sonnet or GPT-4... and it ALL runs like magic.
These local models... if I could just get them to work well.
In research and testing I end up spending soooo many bucks.
Honestly this was a little over my head and I didn't fully grasp everything you said.
I've only been programming for a total of 3 months, and Python for less than a month. As a beginner programmer, what are your thoughts about just trying to write Python scripts and using complex Python logic to pass responses and prompts between endpoints for Ollama?
Like I said, I'm a beginner, so maybe I'm missing something. Do I need to have LangChain involved just to mess around like that?
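On the question above: plain Python is enough to experiment. Below is a minimal sketch that passes one model reply into a second prompt using only the standard library. It assumes Ollama is running on its default port (11434) with a model named `llama3` already pulled; `ask_ollama`, `chain` and the `post` parameter are hypothetical names for illustration, not the code from the video.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def _post_json(url, payload):
    """Send `payload` as a JSON POST request and return the decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def ask_ollama(prompt, model="llama3", post=None):
    """Return the model's text reply for a single prompt.

    `post` lets you swap in a fake transport for testing without a server.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    reply = (post or _post_json)(OLLAMA_URL, payload)
    return reply["response"]


def chain(question, post=None):
    """Feed the first reply into a second prompt: the simplest possible pipeline."""
    draft = ask_ollama(f"Answer briefly: {question}", post=post)
    return ask_ollama(f"Improve this answer:\n{draft}", post=post)
```

Calling `chain("Why is the sky blue?")` with the server running makes two requests in sequence. Frameworks such as LangChain mostly add conveniences on top of this kind of loop.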
Super cool video
Your solution to the perfect agent is Gemini Flash, due to cost, speed and quality, and of course the large context. Try that next and watch it do what you want.
Great content!
Hats off, my friend 👏🙏🎩 We would like you to dedicate a video to using CrewAI within LangGraph ❤
Great vid, your explanations and the experience you bring are in a different league. About using LangGraph with open source, I am thinking of using a LiteLLM proxy to simplify working with different models. Of course that limits open source models to those provided by LiteLLM. Any thoughts on this approach? Anyone?
Great stuff
Thanks, well explained.
I would love to see a Groq model; their API is very easy to use.
Thank you so much.
Thank you 😊
Why don't you use lightning studio? Thanks
If you're still struggling like I do as a non-coder, upload the entire video and code to Gemini 1.5 Pro and ask it what you want, like how to integrate with OpenRouter. It will do everything, explain it more simply, and update the code.
yay!
Great content. Is it possible to download the LLM locally, e.g. from Hugging Face, and then incorporate it into your script so it runs without calling Ollama?
It is possible to do this. Hugging Face has its own interface for inference. You could just create your own Hugging Face module similarly to how I showed with Ollama. Although you wouldn't be sending POST requests, you would just be running the model with the Hugging Face API.
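A rough sketch of what such a module could look like, assuming the `transformers` library (plus a backend such as PyTorch) is installed and the chosen model fits on your hardware. The class name `HuggingFaceModel` and its `generate` method are hypothetical; this is not the implementation from the video.

```python
class HuggingFaceModel:
    """Runs a text-generation model in-process instead of sending POST requests."""

    def __init__(self, model_name, generator=None):
        if generator is None:
            # Imported lazily so the class can be exercised with a fake generator.
            from transformers import pipeline

            # Downloads the model from the Hugging Face Hub on first use.
            generator = pipeline("text-generation", model=model_name)
        self.generator = generator

    def generate(self, prompt, max_new_tokens=128):
        """Return only the newly generated text for `prompt`."""
        outputs = self.generator(
            prompt,
            max_new_tokens=max_new_tokens,
            return_full_text=False,  # drop the echoed prompt from the output
        )
        return outputs[0]["generated_text"]
```

Usage would be along the lines of `HuggingFaceModel("some-org/some-model").generate("Hello")`, where the model identifier is a placeholder for whichever Hub model you choose.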
Yes, your explanation is good, but it is difficult to follow. Instead of writing all the code and explaining it afterwards, it would help if you explained it step by step while writing the code. That would be helpful even for beginners.
Thanks for the feedback. This is something that takes a lot of skill and time to execute well. I'll consider it for future videos.
I think he is using his clone to make this video