Unify
Joined 2 Jan 2021
The Best LLM on Every Prompt ✨
The Janus Interface: How Fine-Tuning in Large Language Models Amplifies the Privacy Risks
In this session, we welcome Xiaoyi Chen from Indiana University who co-authored the paper "The Janus Interface: How Fine-Tuning in Large Language Models Amplifies the Privacy Risks".
About the paper:
--------------------------
The research introduces a novel attack, Janus, which exploits the fine-tuning interface to recover forgotten PIIs from the pre-training data in LLMs. See you there!🤖🧠
🔬The Janus Interface: How Fine-Tuning in Large Language Models Amplifies the Privacy Risks: arxiv.org/pdf/2310.15469
📝 Xiaoyi Chen, Siyuan Tang, Rui Zhu, Shijun Yan, Lei Jin, Zihao Wang, Liya Su, Zhikun Zhang, XiaoFeng Wang, Haixu Tang
Read also:
----------------
📰 The Deep Dive. Follow the latest AI research and industry trends: unifyai.substack.com/
📖 Blogs. Dive into the AI deployment stack: unify.ai/blog
Follow us:
----------------
Website: unify.ai
Github: github.com/unifyai/
Discord: discord.gg/sXyFF8tDtm
Twitter: letsunifyai
Reddit: www.reddit.com/r/unifyai/
#ai #machinelearning #deeplearning
Views: 155
Videos
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting Explained
576 views · 5 months ago
In this session, we welcome Zilong Wang from the University of California, who co-authored the paper "Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting". About the paper: The research introduces SPECULATIVE RAG, a framework that leverages a larger generalist LM to efficiently verify multiple RAG drafts produced in parallel by a smaller, distilled specialist LM. See...
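The drafting-and-verification idea described above can be sketched roughly as follows. This is a toy illustration, not the paper's implementation: the `specialist_draft` and `generalist_score` functions are hypothetical stand-ins for the two models.

```python
# Minimal sketch of the Speculative RAG idea: a small "specialist" model
# drafts answers from different retrieved-document subsets in parallel,
# and a larger "generalist" model only verifies (scores) the drafts.
from concurrent.futures import ThreadPoolExecutor

def specialist_draft(query, doc_subset):
    # Stand-in for the small distilled LM producing a draft answer.
    return f"draft for '{query}' from {len(doc_subset)} docs"

def generalist_score(query, draft):
    # Stand-in for the large LM scoring a draft; here a toy heuristic.
    return len(draft)

def speculative_rag(query, retrieved_docs, n_subsets=3):
    # Partition retrieved documents into subsets to get diverse drafts.
    subsets = [retrieved_docs[i::n_subsets] for i in range(n_subsets)]
    with ThreadPoolExecutor() as pool:
        drafts = list(pool.map(lambda s: specialist_draft(query, s), subsets))
    # The generalist LM never generates; it only picks the best draft.
    return max(drafts, key=lambda d: generalist_score(query, d))

answer = speculative_rag("what is RAG?", ["d1", "d2", "d3", "d4", "d5"])
print(answer)
```

The point of the structure is that the expensive model does one cheap scoring pass per draft instead of a long generation.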
Knowledge Circuits in Pretrained Transformers Explained
981 views · 5 months ago
In this session, we welcome Yunzhi Yao from Zhejiang University, China, who co-authored the paper "Knowledge Circuits in Pretrained Transformers". About the paper: This paper dives into the computation graph of language models to uncover the knowledge circuits that are instrumental in articulating specific knowledge 🤖🧠 🔬Knowledge Circuits in Pretrained Transformers: arxiv.org/pdf/2405.17969v1 📝...
Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models
207 views · 5 months ago
In this session, we welcome Devichand Budagam from IIT Kharagpur, India, who co-authored the paper "Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models". About the paper: This paper introduces the Hierarchical Prompting Taxonomy (HPT), a universal evaluation metric that can be used to evaluate both datasets' complexity and LLMs' capabilities🧠 🔬 Hierarchic...
Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences Explained
293 views · 5 months ago
In this session, we welcome Shreya Shankar from UC Berkeley, who co-authored the paper "Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences". About the paper: This paper introduces EvalGen, an interface that provides automated assistance in generating evaluation criteria and implementing assertions🧠👩💻 🔬 Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences:...
Efficient Multi-Prompt Evaluation Explained
155 views · 6 months ago
In this session, we welcome Felipe Polo from the University of Michigan, who co-authored the paper "Efficient multi-prompt evaluation of LLMs". About the paper: This paper introduces PromptEval, a new method for estimating performance across a large set of prompts, borrowing strength across prompts and examples to produce accurate estimates under practical evaluation budgets. 🔬 Efficient multi...
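The "borrowing strength across prompts" idea can be illustrated with a simple shrinkage estimator: each prompt's accuracy estimate is pulled toward the pooled mean, so prompts with few observations still get reasonable estimates. This is only a toy illustration of the principle, not PromptEval's actual method.

```python
# Toy illustration of borrowing strength across prompts: instead of
# estimating each prompt's accuracy only from its own few observations,
# shrink every per-prompt estimate toward the pooled mean.
def shrunk_estimates(observations, strength=5.0):
    # observations: dict mapping prompt id -> list of 0/1 correctness scores
    all_scores = [s for scores in observations.values() for s in scores]
    pooled = sum(all_scores) / len(all_scores)
    estimates = {}
    for prompt, scores in observations.items():
        n = len(scores)
        raw = sum(scores) / n
        # Weighted average of the prompt's own mean and the pooled mean;
        # prompts with few observations lean more on the pooled estimate.
        estimates[prompt] = (n * raw + strength * pooled) / (n + strength)
    return estimates

obs = {"p1": [1, 1, 0, 1], "p2": [0], "p3": [1, 0]}
print(shrunk_estimates(obs))
```

With a small evaluation budget, "p2" has a single observation, so its estimate stays close to the pooled mean rather than collapsing to 0.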
Influential Data Retrieval Explained
127 views · 6 months ago
In this session, we welcome Huawei Lin from The Rochester Institute of Technology, who co-authored the paper "Token-wise Influential Training Data Retrieval for Large Language Models". About the paper: The authors propose RapidIn, a scalable framework adapting to LLMs for estimating the influence of each training data. The proposed framework consists of two stages: caching and retrieval. 🔬 Tok...
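The two-stage caching-and-retrieval workflow described above can be sketched as follows. The vectors and the "compression" here are toy stand-ins, not RapidIn's actual gradient compression scheme.

```python
# Sketch of a two-stage influence workflow: stage 1 caches a (compressed)
# gradient vector per training example; stage 2 estimates each example's
# influence on a test example as a gradient dot product.
def cache_gradients(train_grads, keep_dims=2):
    # "Compress" by keeping only the first few dimensions (a stand-in
    # for RapidIn's real random-projection-style compression).
    return {name: g[:keep_dims] for name, g in train_grads.items()}

def retrieve_influential(cache, test_grad, top_k=2):
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    scores = {name: dot(g, test_grad) for name, g in cache.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

train_grads = {"ex1": [0.9, 0.1, 0.3], "ex2": [-0.5, 0.2, 0.8], "ex3": [0.4, 0.7, 0.1]}
cache = cache_gradients(train_grads)
print(retrieve_influential(cache, test_grad=[1.0, 0.0]))  # → ['ex1', 'ex3']
```

Caching once and reusing the compressed gradients is what makes the retrieval stage cheap for repeated queries.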
Unify x Qdrant - Complex Agentic RAG Systems ✨
352 views · 6 months ago
Join us as we collaborate with Qdrant to investigate the limitations of traditional naive RAG systems and discover the benefits of advanced agentic RAG systems. In this video, we are joined by expert guest Atita Arora, a solutions architect with 17 years of experience in information retrieval. We will discuss the challenges of traditional RAG systems and explore how the latest...
Gen AI London - LLM Agents For the Enterprise
648 views · 6 months ago
Join us for an informative panel discussion on Generative AI from the recent GenAI London Meetup. Hosted by Ben Ellerby from Aleios, this session brings together industry experts Meryem Arik (Titan ML) and Danilo Poccia (AWS) to explore current trends in enterprise AI. The panel covers key topics, including: • RAG vs Model Context • Trends in Agentic Workflows • Enterprise use cases for Generati...
Unify Project Demo - LLM Resume Analyser
375 views · 6 months ago
We're excited to introduce a project created by Oscar, Sanjay, Jeyabalan, and Maissa in our Contributor Program: an LLM Resume Analyser. The app extracts and analyzes key skills from resumes, provides matching scores for job descriptions, and suggests resume improvements. Try it out for yourself: ai-llm-resume-analyser.streamlit.app/ Check out the Repo: github.com/OscarArroyoVega/LLM_Resume_Ana...
Unify Project Demo - LlamaIndex RAG Playground
141 views · 6 months ago
We are excited to introduce a project that one of our contributors, Abhijeet, has been working on. Abhijeet has developed the LlamaIndex RAG Playground, an application that allows users to upload PDF documents and ask questions about them. In this project, Abhijeet used LlamaIndex to create a RAG (Retrieval-Augmented Generation) application. The application's interface is built with Streamlit, ...
Unify x Baseten - Boost Deployment ✨
128 views · 7 months ago
Supercharge Your LLM Deployment with Unify & Baseten
Watch the full recording of our collaborative webinar with Baseten, featuring expert guest Phil Kiely. This session reveals how to seamlessly integrate Baseten model endpoints into the Unify Platform, unlocking powerful capabilities for your AI projects. In this video, you'll learn: • How to connect Baseten Model Endpoints into the Unify Plat...
YOCO Explained
243 views · 7 months ago
In this session, we welcome Yutao Sun from Tsinghua University, who co-authored the paper "You Only Cache Once: Decoder-Decoder Architectures for Language Models". About the paper: YOCO is a decoder-decoder architecture for LLMs which only caches key-value pairs once to improve inference memory, prefill latency, and throughput across context lengths and model sizes. 🔬 You Only Cache Once: Deco...
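The memory argument behind YOCO can be illustrated with a back-of-the-envelope comparison: a conventional decoder keeps a KV cache per layer, while a decoder-decoder caches key-value pairs once and lets the cross-attention layers reuse them. This is a rough accounting sketch of the headline idea, not the architecture itself.

```python
# Rough comparison of KV cache size: a standard decoder stores one KV
# cache per layer, while a YOCO-style decoder-decoder stores the cache
# once and shares it across all cross-attention layers.
def kv_cache_entries(n_layers, seq_len, architecture):
    if architecture == "standard":
        return n_layers * seq_len  # one KV cache per layer
    elif architecture == "yoco":
        return seq_len             # KV cached once, shared by all layers
    raise ValueError(architecture)

standard = kv_cache_entries(n_layers=32, seq_len=4096, architecture="standard")
yoco = kv_cache_entries(n_layers=32, seq_len=4096, architecture="yoco")
print(standard // yoco)  # the reduction factor scales with the layer count
```

In practice the self-decoder also keeps a small sliding-window cache, so the real saving is somewhat below this idealized factor.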
Unify Project Demo - LLM Debate
90 views · 7 months ago
We are thrilled to introduce a new project that two of our contributors, Ogban and Sanjay, have been working on: the LLM Debate. In this project, they developed a web application that allows two Language Models (LLMs) to engage in a real-time conversation on a topic chosen by the user. Users can select any two models, input a query, and view the dialogue between the LLMs as it unfolds. This app...
Unify Project Demo - Semantic Router
104 views · 7 months ago
We are excited to present a project two contributors have been working on in our Contributor Program. Indiradharshini and Jeyabalan have been working on a Semantic Router. In this project, the team built a web app that dynamically routes user queries to the most appropriate model based on semantic similarity. User queries are sent to predefined routes of maths or coding. Users also have the opt...
MLCon3 2024 - The Best LLM for Every Prompt with Unify
117 views · 7 months ago
The Best LLM on Every Prompt | J12 Day
235 views · 7 months ago
By the way, is the project serving the client?
Thanks, your project is helpful for me
Please leave the Q&A to the end.
Could you share the Python code please?
Thank you for making this open to everyone, I learned a lot. Thanks!
Daniel's comments are just spot on
Amazing explanation, great work.
How does it work? There is no explanation.
Can you share a copy of these slides please?
Great vid, thanks guys
VS Code and Visual Studio are effectively different products... So your title is a bit off.
Great stuff! Thanks a lot. It could be interesting to combine this idea with RAG Fusion.
Thank you for sharing the great presentation!!!!
Totally agree with “engaging in the right process”
Audio is terrible
Great Project
nice
From when is that talk? When will you guys be adding Groq?
It's on the roadmap, thanks for the heads up! Will try to get it added this week.
How do you handle requests with longer context lengths? Do you still use Llama 3 when the request is 100k tokens long?
cool idea.
Let's GO!
I thought you were going to set up Visual Studio, but it was Visual Studio Code instead. I suggest you amend the title of the video.
For those who think "rooter" is the wrong pronunciation, how do you pronounce "roulette" or "routine"? Just wondering...
We also have the ɢᴏᴏsᴇ vowel for _route,_ as well as for _router_ when the router is a person. When it's a device, it gets the ᴍᴏᴜᴛʜ vowel.
Happy for the launch!
"rooter"
british accent coming through strongly here lol
What benefit does this have over just sending the prompt to the LLM of your choice while you're writing the code, instead of having to configure it with these sliders? I feel like I wouldn't know which models are actually being used under the hood, so I couldn't tell whether the optimum one is being used. In code, if I was writing a function to read in a PDF, break it down into parts, and then maybe provide some example code based on those parts, I would just push the PDF to Mixtral or something to read it in and summarize it, since that is not technically difficult, and then pass that summary to something like GPT-4 to generate code based on those summaries, without having to worry about outside configuration or what's actually being used. The benefit I see is that it would be fuzzy logic trying to handle this in one generic call rather than these specific calls, simplifying the development process. But I don't see how it would handle a situation like this, where one model would be very cheap and one would be the most expensive/performant. How would the fuzzy logic handle that? If you say you want to minimize costs, would it just suddenly start using Mixtral/Llama for coding instead of GPT-4/Claude/Gemini?
The router gives full telemetry on what model and provider are being used behind the scenes, and it's also possible to make the final LLM decision on your local system, after just querying the neural scoring function as an API. We will release case studies explaining this soon!
This is forward-looking and impressive!
Interesting
Really interesting product. Congratulations!
substantial approach!
Great approach! Glad to have been a contributor to this amazing organisation
Great!
Wow… Looks really interesting!! Will definitely try it out :)
interesting approach! thanks for sharing your updates!
It's not a 2D convolution, it's a 1D convolution. I'm trying to figure out how that works.
This does a pretty good job :) e2eml.school/convolution_one_d.html
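For readers with the same question, a 1D convolution can be written in a few lines of plain Python. This is a generic illustration (cross-correlation with no padding, as most ML libraries implement "convolution"), not the model's actual code.

```python
# A 1D convolution (really cross-correlation, as in most ML libraries):
# slide the kernel along the signal and take dot products, no padding.
def conv1d(signal, kernel):
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

# A [-1, 1] kernel acts as a discrete difference (edge) detector.
print(conv1d([1, 2, 4, 4, 1], [-1, 1]))  # → [1, 2, 0, -3]
```

The output length is `len(signal) - len(kernel) + 1`, which is why unpadded convolutions shrink the sequence slightly.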
'Promo sm' 🔥
I wonder how much this would transfer to a diffuser. Run a step on the minidiffuser student, denoise that output by one step on the teacher model, then train the student on the teacher's output. So instead of directly copying the teacher's distribution, the student is navigating from its current distribution toward the teacher's.
Yeah I think that could work. Still a few weeks left before the NeurIPS submission deadline 😅
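The idea from the comment above could be sketched, very roughly, as a training loop: the student takes a denoising step, the teacher refines the student's output by one step, and the student is trained toward that refinement. All models here are toy stand-ins; this is not a real diffusion implementation.

```python
# Sketch of the proposed distillation loop: student denoises, teacher
# refines the student's output one step, student is nudged toward the
# teacher's refinement instead of copying the teacher's raw distribution.
class ToyDenoiser:
    def __init__(self, bias):
        self.bias = bias
    def __call__(self, x, t):
        return x * 0.5 + self.bias  # pretend "denoising" toward a mean

def train_step(student, teacher, noisy_sample, t, lr=0.1):
    student_out = student(noisy_sample, t)  # student's one-step denoise
    target = teacher(student_out, t - 1)    # teacher refines it one step
    # Toy "gradient step": nudge the student's bias toward the target.
    error = target - student_out
    student.bias += lr * error
    return abs(error)

student, teacher = ToyDenoiser(bias=0.0), ToyDenoiser(bias=1.0)
errors = [train_step(student, teacher, noisy_sample=2.0, t=10) for _ in range(20)]
print(errors[0] > errors[-1])  # the student drifts toward the teacher
```

The key property the sketch preserves is that the student navigates from its current output toward the teacher's, rather than matching the teacher directly.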
🚀
great talk!
Thanks a lot for hosting! Slides are here: docs.google.com/presentation/d/1aHwDQaGfy2Dg6r68uRH7kFOuTKka3tzZzEIZIf35VNM/edit?usp=sharing&resourcekey=0-q6APQHTzHtB6jqYa-Snk2g There was a small mistake on the Caching slide where the prompts were displayed for the opposite caching strategies, but it's now fixed.
Amazing explanation, professor. A really good overview of accelerator technologies.
Are you able to debug on Docker running on WSL (without Docker Desktop)?
Yes, you can debug on Docker running on WSL without Docker Desktop using Visual Studio Code or other IDEs that support WSL and Docker integration.
@@unifyai PyCharm is supposed to support WSL and Docker integration, but for some reason it is failing on debugging in such a setup
niiiice
Is there any way to run the existing FX pass on the FX graph before modifying it for a custom ASIC?
You don't add any surcharge to the providers' price, OK... but I would almost rather you did take a couple of fractions of a cent for each request, because if not, it makes me wonder how you're actually making money.
After looking at their website, it doesn't look like they are trying to at all. They have a contributor program. It's a volunteer program with like 3 or 4 tiers. I almost applied because it sounds nice to work on a team again, but nope, not working for free or even for less than $20/hr. I hate it, but we live in CAPITALISM; continuing to be an alive person takes money. :( If I had all the money I needed to continue to be an alive person, I would apply to join them in an instant.
We plan on incorporating cost-saving dynamic routing between providers on a prompt-by-prompt basis, and then charging some % of the costs we're able to save for each user. Either way, we're determined that the user doesn't see any price inflation beyond the pay-per-compute rates when using our Model Hub. Happy to answer any other questions! :)
Thanks for the message @EternalKernel. Our contributor program is aimed at people who are starting out in the field, and we provide ongoing mentorship, weekly meetings and guidance for such people, as they work alongside our team and we help them start their OS + ML journey. It's not the same as an internship, and the target audience is different. Seems like our program isn't the best fit for you, which is totally fine. We wish you the best of luck landing your next internship! 💪
Would be nice to be able to compare different models with each other in the same graph
You read our minds! Here th-cam.com/video/o8yD_QBhmsw/w-d-xo.html I mention our plans to incorporate exactly this feature. It should hopefully be live in the next week or two!
cool stuff