Evaluating LLM-based Applications // Josh Tobin // LLMs in Prod Conference Part 2

MLOps.community

Views: 4,597

Published on Jul 11, 2023
This portion is sponsored by Gantry.
Website: gantry.io/
A simple, powerful SDK for model instrumentation
Gantry's SDK gives you easy access to all of your production data and metrics, just by adding a few lines of code.
//Abstract
Evaluating LLM-based applications can feel like more of an art than a science. In this workshop, we'll give a hands-on introduction to evaluating language models. You'll come away with knowledge and tools you can use to evaluate your own applications, and answers to questions like:
Where do I get evaluation data from, anyway?
Is it possible to evaluate generative models in an automated way? What metrics can I use?
What's the role of human evaluation?
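As a rough illustration of the automated-metrics question above, here is a minimal Python sketch of two common reference-based scores: exact match and token-level F1. It is not taken from the workshop; the function names and the whitespace-and-lowercase normalization are assumptions chosen for brevity, and such overlap scores are only a coarse proxy for the quality of generated text.

```python
from collections import Counter


def normalize(text):
    """Lowercase and split on whitespace (an assumed, deliberately simple normalization)."""
    return text.lower().split()


def exact_match(prediction, reference):
    """True when the normalized token sequences are identical."""
    return normalize(prediction) == normalize(reference)


def token_f1(prediction, reference):
    """Harmonic mean of token-level precision and recall against one reference."""
    pred = normalize(prediction)
    ref = normalize(reference)
    if not pred or not ref:
        # Both empty counts as a perfect match; exactly one empty scores zero.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Scores like these are cheap to run over a fixed evaluation set, but they reward surface overlap rather than correctness, which is one reason the question of human evaluation remains relevant.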
//Bio
Josh Tobin is the founder and CEO of Gantry. Previously, Josh worked as a deep learning & robotics researcher at OpenAI and as a management consultant at McKinsey. He is also the creator of Full Stack Deep Learning (fullstackdeeplearning.com), the first course focused on the emerging engineering discipline of production machine learning. Josh did his PhD in Computer Science at UC Berkeley advised by Pieter Abbeel.
Science & Technology
