Elvis Saravia
  • 66
  • 273 705
[LLMS News] AGI Predictions, Mamba-2, NLLB, GPT-4 Features, Structured LLM Generation, KLING
Another exciting episode of LLM News!
Links mentioned in the video:
00:00 AGI Predictions - x.com/dwarkesh_sp/status/1798024306573848851
02:10 KLING - kling.kuaishou.com/#expression-body-view
05:12 Extracting GPT-4 Features - openai.com/index/extracting-concepts-from-gpt-4/
10:13 Mistral AI Finetuning - mistral.ai/news/customization/
12:15 NLLB - www.nature.com/articles/s41586-024-07335-x
14:14 Mamba-2 - arxiv.org/abs/2405.21060
16:12 MatMul-free LLMs - arxiv.org/abs/2406.02528
18:15 SaySelf - arxiv.org/abs/2405.20974
20:31 FineWeb-Edu - huggingface.co/spaces/HuggingFaceFW/blogpost-fineweb-v1
22:59 Structured Generation - x.com/dottxtai/status/1798443290913853770
24:58 Nomic-Embed-Vision - x.com/nomic_ai/status/1798368463292973361
26:17 Tool Usage Course - github.com/anthropics/courses/tree/master/ToolUse
26:51 Cohere Cookbooks - docs.cohere.com/page/cookbooks
28:03 AI Agents in LangGraph - www.deeplearning.ai/short-courses/ai-agents-in-langgraph/
#ai #machinelearning #science #engineering
Views: 516

Videos

New prompting method uses thought templates | Buffer of Thoughts
Views 4812 hours ago
This paper proposes a prompting method that uses thought templates to improve the accuracy and efficiency of LLM-based reasoning. Paper: arxiv.org/abs/2406.04271 #ai #machinelearning #llms #promptengineering
Has prompt engineering been solved?
Views 30312 hours ago
Overview of the new prompt engineering feature by Anthropic. Announcement here: x.com/omarsar0/status/1789041930896028007 #promptengineering #ai #machinelearning
Extracting features from Claude 3 Sonnet
Views 18119 hours ago
A short summary of insights and takeaways from this exciting new paper on extracting interpretable features from Claude 3 Sonnet. Paper: transformer-circuits.pub/2024/scaling-monosemanticity/index.html #ai #machinelearning #science #llms
[LLM News] xAI Series B, Codestral, LLM Guide, AutoGen Course, Symbolic Chain-of-Thought
Views 885, a day ago
Welcome to another exciting episode of LLM News! Links mentioned in the video: 00:00 xAI Series B - x.ai/blog/series-b 02:04 Anthropic New Paper - transformer-circuits.pub/2024/scaling-monosemanticity/index.html 05:53 Codestral - mistral.ai/news/codestral/ 08:16 LC-Boost - arxiv.org/abs/2405.15318 12:05 LLM Guide - www.oreilly.com/radar/what-we-learned-from-a-year-of-building-with-llms-part-i/ ...
Exploring Capabilities of Long-Context LLMs
Views 37014 days ago
Experimenting with Long-Context LLMs. Long-context LLMs are incredibly useful, and their flexibility provides a great interface for deeper exploration of capabilities. Here are some more thoughts on long-context LLMs and how I use them. #ai #machinelearning #engineering
[LLM News] GPT4-o, Project Astra, Veo, Copilot+ PCs, Gemini 1.5 Flash, Chameleon
Views 73814 days ago
Another exciting episode of LLM News! Links mentioned in the video: 00:00 Copilot PCs - blogs.windows.com/devices/2024/05/20/introducing-the-ultimate-copilot-pcs-the-all-new-surface-pro-and-surface-laptop/ 03:22 Veo - deepmind.google/technologies/veo/ 06:18 Project Astra - deepmind.google/technologies/gemini/project-astra/ 08:42 Gemini 1.5 Flash - deepmind.google/technologies/gemini/flash/ 10:0...
[LLM NEWS] KANs, Gemma 10M Context, OpenAI Updates?, Automatic Prompt Engineering, Tokenizer Arena
Views 66728 days ago
The Top AI and LLMs news. Links mentioned in the video: 00:00 OpenAI Updates? - OpenAI/status/1788987793613725786 02:13 Automatic Prompt Engineering - AnthropicAI/status/1788958483565732213 08:05 Consistency LLMs - omarsar0/status/1788594039865958762 10:00 Tokenizer Arena - huggingface.co/spaces/Cognitive-Lab/Tokenizer_Arena 11:55 Gemma 10M Context Window - t...
[LLM NEWS] AlphaFold 3, xLSTM, OpenAI's Model Spec, DeepSeek-V2, OpenDevin CodeAct 1.0
Views 1.4K, a month ago
SWE-Agent | An LLM-based Software Engineering Agent
Views 932, a month ago
Better and Faster LLMs via Multi-token Prediction
Views 2K, a month ago
Training an LLM to effectively use information retrieval
Views 903, a month ago
Microsoft introduces Phi-3 | The most capable small language model?
Views 2.9K, a month ago
Llama 3 is here! | First impressions and thoughts
Views 1.3K, a month ago
Understanding LLM Settings
Views 10K, a month ago
Zero-shot Prompting Explained
Views 8K, a month ago
Basic Prompt Examples for LLMs
Views 2.7K, a month ago
General Tips for Designing Prompts
Views 4.9K, a month ago
Elements of a Prompt
Views 5K, a month ago
Getting Started with the OpenAI Playground
Views 5K, 2 months ago
Prompt Engineering Overview
Views 183K, a year ago
How to save the world and forward your career in 5 easy steps | Women in NLP Talks
Views 229, 2 years ago
(Hopefully-Reusable) Life Lessons for PhD Students in NLP
Views 1.5K, 3 years ago
Dive into Deep Learning (Study Group): Deep Learning Computation with PyTorch | Session 5
Views 870, 3 years ago
Keep Learning ML #3 | Contrastively Trained Structured World Models
Views 296, 3 years ago
Dive into Deep Learning (Study Group): Linear Neural Networks | Session 3
Views 1.6K, 3 years ago
Keep Learning ML #2 | Language-conditioned policy learning, Effective ML Testing, EagerPy
Views 228, 3 years ago
Keep Learning ML (Session 1) | DSV, CompLex, Modern tools for emotions
Views 562, 3 years ago
Building tools and frameworks for large-scale social media mining (by Dr. Juan M. Banda)
Views 213, 3 years ago
Question Understanding: COVID-Q: 1,600+ Questions about COVID-19
Views 145, 3 years ago

Comments

  • @JP-io8cr 4 days ago

    How can I access this playground? Congrats on the great video!

  • @marcofernandez8723 10 days ago

    What's the AutoGen course link?

    • @elvissaravia 10 days ago

      Here it is: www.deeplearning.ai/short-courses/ai-agentic-design-patterns-with-autogen/

  • @user-lj1gu2zt1s 11 days ago

    Good

  • @Chris-vm2wf 13 days ago

    Thank you Elvis! Like your scientific, non-clickbaity approach!

    • @elvissaravia 11 days ago

      Thank you for that comment. I am happy to hear the approach resonates.

  • @thanhdatnguyen8143 16 days ago

    keep up the great work!

  • @covertassassin1885 17 days ago

    Thanks for the great content!

  • @minhsp3 17 days ago

    GOOD INDIAN ACCENT, VERY UNDERSTANDABLE. BEST IS TO READ THE PAPER

  • @QR_Dev 27 days ago

    I love the format. Some things I'm not going to read further, but I'm so glad to get a quick summary of them. Keep it going! 👍

  • @pokerandphilosophy8328 28 days ago

    7:15 "I notice the formatting is not so great here..." If you click on the button "Add to conversation" below, it will appear formatted correctly in the left panel. (You can then delete it if you want to regenerate the response and/or modify the query.)

  • @sfilkin 29 days ago

    You are very good at noticing news that deserves attention! I would like to see a weekly digest of the most important technologies and detailed analyses of some of them during the week.

  • @abduljaweed2886 29 days ago

    Thanks for sharing, Elvis

  • @kgnet8831 a month ago

    Great video👍 A deep dive on xLSTM would be cool😀

  • @vaibhavnakrani2983 a month ago

    Awesome 🙌🏻

  • @user-is3de6ww9p a month ago

    Please explain 1-bit LLMs in detail.

    • @elvissaravia a month ago

      Will try to do a deep dive on this at some point. I have a few I am working on already, but yes, this is a very interesting development in LLMs.

  • @elvissaravia a month ago

    Hope you enjoy! Also let me know if you would like me to do a deep dive on any of the topics.

    • @ritvikrastogi4912 a month ago

      Replacing breakfast time news with your videos 🎉

  • @super7ace a month ago

    Awesome. Can you upload more videos please?

    • @elvissaravia a month ago

      Working on it.

  • @gopsda a month ago

    You can do this at 1x or normal frame rate instead. Not able to hear what you say about the paper; there is no clarity.

    • @elvissaravia a month ago

      Yeah. I messed up this one. I think I might have rendered it at a higher frame rate.

  • @DonBranson1 a month ago

    Amazing video. Thanks for the concise review of key concepts.

  • @Xaelum a month ago

    In a way, this feels like optimizing the decoding process during training (think beam search + gradient descent). Very cool paper and very interesting video!

  • @WesRoth a month ago

    I mentioned this paper in a video that I will post later today. I'll link to this video so people can go deeper. Love your content on Twitter, looking forward to watching your YouTube stuff too!

    • @elvissaravia a month ago

      Appreciate it. Thanks for the kind words. Will keep posting more of these detailed paper reviews. It seems to be clicking with people.

    • @cajampa a month ago

      @@elvissaravia I came from Wes's recommendation. This was a great watch, keep going. I know you will find your audience here growing. Also remember to tell people to subscribe. It is annoying to hear from all the YouTubers you watch, but they all do it because it works.

  • @MoonWho78 a month ago

    Why couldn't you have had all 4 in the example on the first slide of your video? It's a great color mapping for everything else but you neglected to include context.

  • @imadyoubiidrissi85 a month ago

    Heh, Badr is my brother :3 very proud of him. Thnx for the video

    • @elvissaravia a month ago

      Awesome. It’s an excellent paper. Kudos to all the authors.

  • @barzinlotfabadi a month ago

    Awesome! 😁

  • @nitinleo1986 a month ago

    Hello @Elvis, Thanks for summarizing this so nicely. I agree with your point that an IR system can make it much better to use with a small or large language model. I think with this we can use a less complex language model and information retrieval to answer more complex questions. I believe that this is what we do ourselves as well, i.e., we use our knowledge whenever it is sufficient, but when it is not and we identify those instances, we access either external sources of knowledge or the compartmentalized knowledge in our brain to retrieve the information that a question requires. This is a great step toward actually making the large language model smaller but effective. Thanks for describing it so well. I am looking forward to the following articles.

  • @thewimo8298 a month ago

    Would love that to be a video series of interesting LLM paper summaries!

  • @Marcel.Hasslocher a month ago

    In PT-BR: th-cam.com/video/0VtGC_N3Rvk/w-d-xo.htmlsi=s5QDIeDhtLmYlZxb

  • @SaidiiyavinjamuriDivya-sc9zh a month ago

    1st comment. How to integrate?

  • @lymphe a month ago

    Just curious why you decided to use GPT-3.5 Turbo and not any of the other options? :)

  • @user-vi8nf8sr3s a month ago

    Best video, thank you.

  • @elvissaravia a month ago

    Let me know if you would like to see more videos like this and deep dives in the future.

    • @Xaelum a month ago

      I loved the video. Please continue adding more videos to YouTube!

    • @LuckyMinotaur-00 a month ago

      Keep sharing such wonderful information... If possible, try to share some lectures about LLMs.

    • @elvissaravia a month ago

      @@LuckyMinotaur-00 I will work on that in future videos. Thank you!

    • @elvissaravia a month ago

      @@Xaelum I will keep pushing out more. Thank you!

  • @sfilkin a month ago

    Great pick!

  • @sfilkin a month ago

    Great find! I will utilise it.

  • @elvissaravia a month ago

    Please like and comment if you find these short paper summaries useful.

  • @achrafash a month ago

    It’s an amazing guide, thank you Elvis!

  • @elvissaravia 2 months ago

    Hi everyone. I am trying out a new series of short summaries of interesting AI papers. Let me know if you like this format. :) Paper link: arxiv.org/abs/2404.03414

  • @juanpablorodriguezgonzalez5181 2 months ago

    Thank you so much. Good video.

  • @rwang5688 3 months ago

    Great video - Thank you for putting this together. Quick question: is Data Augmented Generation the same as Retrieval Augmented Generation? They sure seem very similar in concept and implementation.

  • @tejasvibirdh4583 3 months ago

    First of all, great job! It really did help a lot. I also request that you post more lessons, or at least tell us when the other chapters will be uploaded.

  • @user-jr4sm2pe5k 6 months ago

    But could you please tell why we cannot directly use the playground instead, where we can give the prompt in natural language and get the response without using the Python code?

  • @dameanvil 6 months ago

    00:41 🔑 Prompt engineering involves using instructions and context to leverage language models effectively for various applications beyond just language tasks.
    02:18 🔍 Prompt engineering is crucial for understanding language model capabilities, applicable in research and industry, as highlighted by job postings emphasizing this skill.
    03:37 🛠 Components of a prompt include instructions, context, input data, and output indicators, which affect the model's response; settings like temperature and top-p influence the diversity of the output.
    05:45 📚 Prompt engineering applies to various tasks like text summarization, question answering, text classification, role playing, code generation, and reasoning, showcasing diverse applications.
    09:57 💻 Language models, like OpenAI's, exhibit impressive code generation abilities, handling natural language prompts for tasks such as SQL query generation.
    10:51 🤔 While language models can reason to an extent, specific prompts and techniques like chain-of-thought prompting help improve their reasoning capabilities, although it's an evolving field.
    11:19 📝 The lecture delves into code examples and tools, showcasing how prompt engineering techniques are applied practically, using OpenAI's Python client and other tools.
    19:34 🚀 Advanced techniques like few-shot prompting, chain-of-thought prompting, and zero-shot chain-of-thought prompting boost performance on complex tasks by providing demonstrations and step-by-step reasoning instructions to the language model.
    23:13 🌟 Prompt engineering is an exciting space where crafting clever prompts empowers language models, allowing for powerful capabilities and advancements in various applications.
    23:27 🧠 Prompt engineering aims to improve language models for complex reasoning tasks, as these models aren't naturally adept at such tasks.
    24:22 🗳 Self-consistency in prompting involves generating multiple diverse reasoning paths and selecting the most consistent answer, boosting performance on tasks like arithmetic and commonsense reasoning.
    25:16 🔍 Demonstrating steps to solve problems within prompts guides models to produce correct answers consistently.
    26:37 📚 Using language models to generate knowledge for specific tasks has emerged as a promising technique, even without external sources or APIs.
    30:15 🐍 Program-aided language models use interpreters like Python to execute generated intermediate reasoning steps, enhancing complex problem-solving.
    32:35 🔄 ReAct frameworks interleave language models and external sources for reasoning traces, action plans, and task handling.
    35:20 📊 Tools and platforms for prompt engineering offer capabilities for development, evaluation, versioning, and deployment of prompts.
    40:08 🧰 Various tools allow combining language models with external sources or APIs for sophisticated applications, augmenting the generation process.
    44:45 📝 Leveraging tools like LangChain allows building on language models by chaining and augmenting data for generating responses.
    46:22 🧠 Prompt engineering involves combining ReAct-based actions with language models, showcasing the observation, thought, and action sequence for varied tasks.
    47:53 🛠 Updated and accurate information from external sources is crucial for prompt engineering applications, highlighting the importance of up-to-date data stores.
    48:34 📊 Data augmentation in prompt engineering involves reliance on external sources and tools to generate varied content, requiring data preparation and formatting.
    50:34 💬 Prompt engineering explores clever problem-solving techniques to engage language models effectively, like converting questions into different languages while maintaining context and sources.
    52:40 ⚠ Model safety is a critical aspect of prompt engineering, focusing on understanding and mitigating language model limitations, biases, and vulnerabilities, including initiatives like prompt injections to identify system vulnerabilities.
    55:12 🔒 Potential vulnerabilities like prompt injection, prompt leaking, and jailbreaking highlight risks of manipulating language model outputs, emphasizing the importance of reinforcing system safety measures.
    58:30 🎯 Reinforcement Learning from Human Feedback (RLHF) aims to train language models to meet human preferences, emphasizing the relevance of high-quality prompt datasets in this training process.
    01:00:06 🌐 Prompt engineering facilitates the integration of external sources into language models, enabling diverse reasoning capabilities and applications, particularly useful for scientific tasks requiring factual references.
    01:01:27 🔄 Understanding emerging language model capabilities, such as thought prompting, multi-modality, and graph data handling, is a crucial area for future exploration and development in AI research.
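
The self-consistency step summarized at 24:22 can be sketched as a majority vote over final answers. This is a minimal sketch and not code from the video: `majority_answer` is an illustrative name, and the `sampled` list is a hypothetical stand-in for the final answers parsed from five reasoning paths sampled from a model at a non-zero temperature.

```python
from collections import Counter


def majority_answer(answers):
    """Return the most frequent final answer and the share of paths that agree.

    `answers` holds one final answer per sampled reasoning path. On a tie,
    Counter.most_common keeps the answer that was seen first.
    """
    votes = Counter(answers)
    answer, count = votes.most_common(1)[0]
    return answer, count / len(answers)


# Hypothetical final answers from five sampled reasoning paths (assumed data).
sampled = ["67", "35", "67", "67", "41"]

answer, agreement = majority_answer(sampled)
print(answer, agreement)  # prints: 67 0.6
```

The agreement share can double as a rough confidence signal: a low share suggests the sampled paths disagree and the answer deserves a closer look.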

  • @vashisht1 9 months ago

    Good Content.👌✌

  • @hernanperez912 9 months ago

    I am a complete alien on this topic, yet I can see the value of your videos. Great job bro 👏

  • @abhipatil4844 9 months ago

    Awesome

  • @iduomollc 10 months ago

    Fantastic info. Thank you for your hard work.

  • @Yifzmagarki 10 months ago

    An LLM is a stupid system that works like a parrot, into which an incredible amount of money has been poured, and now for this money the entire community is trying to revive the corpse with crutches, to make it think what it cannot initially think.

  • @mishapatel2231 10 months ago

    Thank you for sharing this video. It's really great learning. Can you elaborate on prompt engineering for multiple-choice question answering tasks?