Building Scalable AI Infrastructure with Kuberay and Kubernetes | Ray Summit 2024

KubeRay maintainers Andrew Sy Kim from Google and Kai-Hsun Chen from Anyscale present an in-depth look at scaling generative AI workloads using KubeRay and Kubernetes. Their talk addresses how this integration provides a lightweight, flexible solution for diverse infrastructure requirements in AI deployments.
The presentation covers crucial integrations with the Kubernetes ecosystem and cloud providers, focusing on essential features for training and fine-tuning. These include gang scheduling, distributed checkpointing, and retries. The speakers explore KubeRay's capabilities in supporting both online and offline inference through features like Ray Autoscaler and fault tolerance, along with its compatibility with various hardware accelerators including GPUs, TPUs, and CPUs.
The session includes current KubeRay project updates and developments, highlighting Kubernetes community enhancements such as hierarchical scheduling and dynamic resource allocation (DRA). This comprehensive overview demonstrates how KubeRay and Kubernetes work together to scale AI infrastructure across multi-cloud, production environments.
--
Interested in more?
- Watch the full Day 1 Keynote: th-cam.com/video/jwZHJthQvXo/w-d-xo.html
- Watch the full Day 2 Keynote: th-cam.com/video/Lury2ad6KG8/w-d-xo.html
--
🔗 Connect with us:
- Subscribe to our YouTube channel: www.youtube.com/@anyscale
- Twitter: x.com/anyscalecompute
- LinkedIn: linkedin.com/company/joinanyscale/
- Website: www.anyscale.com

Views: 471

Videos

Scaling Ray to 10K NPUs: Huawei's Hyperscale Journey | Ray Summit 2024

19:34

542 views, 21 days ago

Huawei's ambitious project of integrating 10,000 Ascend NPUs into a Ray cluster pushes the boundaries of distributed computing. In this technical deep dive, Boyuan Chen, Chong Yin Tan, and Xiaoshuang Liu from Huawei share their experiences and innovations in creating a hyperscale Ray-NPU infrastructure. The presenters detail the challenges of migrating existing business cases to Ray and adding ...

Optimizing vLLM Performance through Quantization | Ray Summit 2024

38:11

939 views, 21 days ago

At Ray Summit 2024, Michael Goin and Robert Shaw from Neural Magic delve into the world of model quantization for vLLM deployments. Their presentation focuses on vLLM's support for various quantization methods, including FP8, INT8, and INT4, which are crucial for reducing memory usage and enhancing generation speed. In the talk, Goin and Shaw explain the internal mechanisms of how vLLM leverage...

Scaling AI at Autodesk with Ray and Metaflow | Ray Summit 2024

15:05

318 views, 21 days ago

Autodesk's journey into large-scale 3D generative AI has led to a powerful synergy between Ray and Metaflow. In this session, Thomas Gale and Peter Meltzer from Autodesk, joined by Savin Goyal from Outerbounds, unveil how they've harnessed these tools to process terabytes of data and train advanced 3D models. The presenters dive into their innovative approach of integrating Ray's distributed co...

Scaling LLMs on Google Cloud: Synergy Between Ray, TPU, and GKE | Ray Summit 2024

16:49

651 views, 21 days ago

As Large Language Models (LLMs) become increasingly central to AI-driven solutions, the challenge of deploying them at scale demands innovative approaches. In this cutting-edge session, Fanhai Lu and Richard Liu from Google unveil a high-performance serving stack that harnesses the combined power of Ray, TPUs, and Google Kubernetes Engine (GKE). The presenters tackle the trifecta of LLM deploym...

Reverb's ML Evolution: From Data Engineering to MLOps | Ray Summit 2024

11:39

164 views, 21 days ago

As machine learning becomes integral to business operations, data engineering teams often find themselves at the forefront of ML infrastructure development. Sam Hallam from Reverb shares the company's journey in this transformative process, offering valuable insights for organizations navigating similar transitions. Hallam delves into the challenges and learnings encountered while building a sc...

The LLM-Cloud Synergy: NebiusAI's Insider Perspective | Ray Summit 2024

14:42

100 views, 21 days ago

In the race to build superior AI clouds, NebiusAI has discovered a crucial advantage: an in-house LLM team. Aleksandr Patrushev takes the stage to reveal how this synergy has become a game-changer in cloud service development. Patrushev shares key insights gleaned from NebiusAI's dual role as both LLM developer and cloud provider. He illustrates how firsthand experience in LLM creation directly...

How Rubrik Unlocked AI at Scale with Ray Serve | Ray Summit 2024

14:20

110 views, 21 days ago

Rubrik's quest for high-performance, real-time AI inference led them to a game-changing solution: Ray Serve. In this technical deep dive, Shaikh Ismail and Shivanshu Agrawal from Rubrik unveil their journey of harnessing Ray's ML model serving library to meet demanding scalability and throughput requirements. The duo explores Ray Serve's unique capabilities that made it stand out among alternat...

Pricing and Packaging Your AI Products for Scale | Ray Summit 2024

11:27

149 views, 21 days ago

In the fast-paced world of AI, where capabilities evolve at breakneck speed, pricing strategies can make or break a product's success. Metronome CEO Scott Woody takes the stage to demystify the complex art of AI product pricing and packaging. Woody tackles the pressing questions that keep AI entrepreneurs up at night: How can you confidently price products in an industry that's constantly in fl...

Multi-tenant Data Processing with Ray: Phaidra's Approach to Industrial AI | Ray Summit 2024

14:03

118 views, 21 days ago

Phaidra is reshaping the landscape of industrial and data center optimization with AI-driven controls. In this illuminating session, Brandon Hernandez and Jerry Luo unveil Phaidra's innovative approach to building a multi-tenant data processing platform on Ray for Reinforcement Learning agents. The presenters delve into the architecture of their Ray-based platform, which forms the backbone of t...

Ray at IBM: Transforming Large-Scale Data Processing for AI and Science | Ray Summit 2024

11:11

106 views, 21 days ago

Dean Wampler from IBM presents how Ray is being utilized for large-scale data processing in AI and scientific research. The session focuses on the Data Prep Kit, an open-source project developed by IBM Research and the AI Alliance, which uses Ray as its core driver for data processing tasks crucial to LLM training and tuning. The presentation demonstrates Ray's capability to enable easy and res...

Fighting Fire with Algorithms: Lockheed's RL-Based Wildfire Solution | Ray Summit 2024

13:23

94 views, 21 days ago

Lockheed Martin unveils its cutting-edge decision-aid system for wildland firefighting, powered by deep reinforcement learning. This innovative approach leverages RLlib's hierarchical and multi-agent abstractions to recommend optimal fire suppression strategies based on complex environmental factors. Dan Jacobson and John Cerillo demonstrate how their team composed a two-level hierarchical agen...

Ray Meets Daft: Supercharging ETL and Analytics | Ray Summit 2024

10:54

207 views, 21 days ago

The Ray ecosystem expands its horizons with Daft, a powerful Python/Rust library that brings distributed ETL and analytics capabilities to Ray clusters. This lightning talk showcases how Daft transforms Ray into a comprehensive Data and ML/AI solution, scaling effortlessly to meet any challenge. Jay Chia demonstrates the seamless integration of Daft with Ray, highlighting its superior performan...

How Datadog is Transforming Time Series Forecasting with Toto | Ray Summit 2024

12:37

131 views, 21 days ago

Datadog revolutionizes time series analysis with Toto, a groundbreaking Time Series-Optimized Transformer designed to tackle the intricate challenges of observability metrics. This session unveils how Toto harnesses advanced transformer architecture to achieve unprecedented forecasting accuracy across diverse domains. Emaad Khwaja delves into the unique attributes that distinguish Toto in the r...

From Spark to Ray: CSS's Data Revolution with Daft | Ray Summit 2024

13:57

126 views, 21 days ago

City Storage Systems (CSS) revolutionizes its machine learning infrastructure by embracing Daft, a powerful DataFrame library seamlessly integrated with Ray. This session unveils how CSS transforms its data processing and ETL workflows, moving beyond traditional Spark clusters to a more unified and efficient Ray-based ecosystem. Ammar Alrashed, Santosh Jha, and Garret Weaver showcase real-world...

Leveraging LLMs and LangGraph @ FlightAware | Ray Summit 2024

14:01

185 views, 21 days ago

KubeSecRay: Fortifying Multi-Tenant Ray Clusters on Kubernetes | Ray Summit 2024

15:54

98 views, 21 days ago

Ray on Kubernetes: Powering Quant Research at Scale | Ray Summit 2024

12:51

123 views, 21 days ago

Scaling LLM Inference: AWS Inferentia Meets Ray Serve on EKS | Ray Summit 2024

12:32

113 views, 21 days ago

Model Training on Snowflake with Ray | Ray Summit 2024

13:52

117 views, 21 days ago

Akash Network: Powering Ray on an Open Source Cloud | Ray Summit 2024

15:38

72 views, 21 days ago

Building Scalable Cross-Modal Search with Ray | Ray Summit 2024

11:18

179 views, 21 days ago

Accelerating Princeton's Computational Science with Ray | Ray Summit 2024

12:53

48 views, 21 days ago

How Vivix Scales Video Ad Classification with Ray | Ray Summit 2024

30:49

126 views, 21 days ago

Building Intelligent AI Infrastructure with O.XYZ's ORI | Ray Summit 2024

21:38

169 views, 21 days ago

The Evolution of Multi-GPU Inference in vLLM | Ray Summit 2024

30:52

408 views, 21 days ago

Intelligent Data Classification with Ray and vLLM at Apple | Ray Summit 2024

24:40

203 views, 21 days ago

How Roblox Scaled Machine Learning by Leveraging Ray for Efficient Batch Inference | Ray Summit 2024

29:12

346 views, 21 days ago

Scaling Ray Train to 10K Kubernetes Nodes on GKE | Ray Summit 2024

35:17

296 views, 21 days ago

How Zoox Accelerated Autonomous Driving with Ray | Ray Summit 2024

22:24

145 views, 21 days ago

Comments

@michaeltrillium a day ago
Andreessen is brilliant
@AIEmployeesWithCosmo 2 days ago
Will we all be substituted by super intelligent ai?
@NickDominikMaxBuehrer 2 days ago
Hey, regarding the stable diffusion training: is there any code available on how to have inhomogeneous GPU Types and assign different GPU types to different parts of the model?
@seanmchugh2866 3 days ago
they've been punching way above their weight. and all they needed was hundreds of billions of dollars in weapons from the u.s. taxpayer.
@argus3354 5 days ago
hello
@Hshjshshjsj72727 6 days ago
I only clicked this vid to say: “LASERS”
@pdelaprimm 6 days ago
Great content, thank you!
@bogo3611 8 days ago
Where can I read the slides?
@lukasenkaS 8 days ago
He’s a genius
@TheEightSixEight 8 days ago
Deep neural networks are deterministic. This is a common misconception and embarrassing to say in this context.
@RohanKumar-vx5sb 9 days ago
Great interview! Keep coming back to this talk since it released two weeks ago.
@vi5hnupradeep 9 days ago
Very informative. 🙌🏼💎
@Flapperdino-ix7ol 9 days ago
bros speaking on chinese or english??
@alexbian5567 9 days ago
He says a lot. But a lot of bullshit as well. 5 min in, he has already started to pouring out SV bullshit, "led the creation of self driving car..." There is no self driving car, yet.. there are only attempts to create "self driving cars". To anyone who questions me if I know who he is, yeah, I know who he is and I have personally interacted with him.
@markomilenkovic2714 9 days ago
"Giant Laser"
@DivineMisterAdVentures 9 days ago
Deterministic vs Probabilistic computers. Cush!! in the drive train. Air tires made motor sports happen. Springs and shock absorbers gave us traction. If Humanity were wiped out after a bad election - do you think AI would rise to take its place? Would they be kings and diplomats? Would there be - an Architect?
@ipushprajyadav 10 days ago
Nice
@lwwells 10 days ago
Came here to hear about Dr Evil
@nathannzenou3822 11 days ago
Hello, thanks for that video. Is it possible to have the link for the code used in the webinar?
@tonypeng8792 13 days ago
Instacart presentation is fantastic! (51:33 to 1:07:37)
@_keepitsocial 13 days ago
custom speed at 0.85 worked great to make this talk sound normal
@mjengman 13 days ago
th-cam.com/video/E-PIidaqCyU/w-d-xo.htmlsi=sDLdM1RoJHotkSnn
@MotivationMan-p2o 14 days ago
Great video, thank you for sharing, appreciate you
@eanerickson8915 14 days ago
The guy is a tech investor. Of course he is going to say big tech won't do AI well.
@itdobemikey 14 days ago
*reads comments* *raises eyebrow* *places pinky to lips*
@KatyYoder-cq1kc 14 days ago
Cease and desist malicious use of AI, energy weapons, satellites: Axis of EVIL / MAGA / Terrorists. My family and I are not your property!
@KarlPages-tm6us 15 days ago
5 minutes in and I'm hearing the excuse of semiconductor not fast enough and I'm already reminded that there was a reason for the slow incremental roll-out of chip improvements. While I'm not a fan of nor promoting stock market - as I find fiat currency abhorrent enough, never mind the flippant value exchange and gambling on infinite pyramid schemes. The world competition for new I.P. breakthroughs is pushing more intelligence into the hands of more people around the globe and more data and intelligence will come up with better solutions for the world's problems than the mentality of the old fixed domestic protection rackets that seemed to have no ending. Sanctions seemed ignorant and more like closing down intelligence
@Uofmdoc 15 days ago
Mark just affirmed that there are idiot politicians and bureaucrats populating DC. Trump with Elon heading the new Department of Government Efficiency. Problem solved.
@Geezweez788 16 days ago
Looked at the thumbnail once and I started looking around for Minnie me.
@michaelmorris5758 16 days ago
cone head
@VK8AW 16 days ago
He is just regurgitating what the experts have said and done. All I see is a businessman riding the wave of AI.
@givim80 16 days ago
7:36 well said
@jdchannelviewer 16 days ago
Russians have used tens of thousands of drones on Ukrainian tanks, APCs and other vehicles. Russia has also used long and medium range drones en masse, where Ukraine has not.
@Jgarcia2320 17 days ago
it means go yourself
@Lululemon2023 17 days ago
He is only one who speaks super fast and super clearly and in a soothing voice. I saw others try to imitate him but end up being very irritating.
@JasonCunliffe 17 days ago
1:03:29 >>> RUNWAY presentation Anastassus Germanadis 1:19:00 >>> World Modelling Visual data richer than language "Towards Universal Simulation"
@andreaskrbyravn855 18 days ago
You know you know
@Mr_i_o 19 days ago
Rewrite the cell's software
@chrisweeks8789 19 days ago
Awesome
@betternotsayit 19 days ago
This man can be the definition of superficial knowledge. The confidence with which he speaks about things he doesn't understand is amazing.
@hamdaniyusuf_dani 11 days ago
How do you know that he doesn't understand what he said?
@DivineMisterAdVentures 9 days ago
@@hamdaniyusuf_dani No. That's not what the comment was.
@hamdaniyusuf_dani 9 days ago
@@DivineMisterAdVentures What was the comment about?
@Crawdaddy_Ro 6 days ago
Imagine thinking Marc Andreessen doesn't know what he's talking about.
@erkinsagroglu8519 19 days ago
7:25 How is it possible to compute attentions separately block by block? Softmax (attention weight) is calculated based on all of the previous tokens and then those softmax scores are multiplied with all of the previous tokens' value vectors to calculate the attention score for the new token. So it should use all of the previous tokens on other blocks twice. What am I missing here?
@erkinsagroglu8519 4 days ago
I read the paper. Turns out the illustration is not 100% accurate (probably for the sake of making it intuitive). It indeed uses every previous block (in case sliding windows is not used) while computing the attention for the next layer.
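The point in the reply above, that every previous block is still used, can be checked with a small sketch. It assumes the common running-max and running-sum ("online softmax") bookkeeping and plain Python lists; the function names are hypothetical and nothing here is taken from vLLM's kernel code. The blockwise version visits the keys and values one block at a time, yet its result matches attention computed over all previous tokens at once.

```python
import math


def full_attention(q, keys, values):
    """Reference: one softmax over all previous tokens, then a weighted sum of values."""
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim)]


def blockwise_attention(q, keys, values, block_size):
    """Same result, but keys/values are read one block at a time."""
    dim = len(values[0])
    running_max = float("-inf")  # largest score seen so far
    running_sum = 0.0            # softmax denominator, relative to running_max
    acc = [0.0] * dim            # unnormalized weighted sum of value vectors
    for start in range(0, len(keys), block_size):
        k_blk = keys[start:start + block_size]
        v_blk = values[start:start + block_size]
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in k_blk]
        new_max = max(running_max, max(scores))
        # Rescale what was accumulated under the old maximum (factor is 0.0 on the first block).
        scale = math.exp(running_max - new_max)
        running_sum *= scale
        acc = [a * scale for a in acc]
        for s, v in zip(scores, v_blk):
            w = math.exp(s - new_max)
            running_sum += w
            for d in range(dim):
                acc[d] += w * v[d]
        running_max = new_max
    return [a / running_sum for a in acc]
```

Each block contributes to both the denominator and the weighted sum, so no previous token is skipped; only the order of the arithmetic changes.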
@MrEmbrance 19 days ago
reddit is cancer and should be closed
@moonsonate5631 19 days ago
00:09 AI's future capabilities are inspiring
02:19 Evolution of computer industry towards neural networks
05:19 AI's evolution aided by compute power and data availability
06:54 AI will transform various industries significantly.
09:52 Transition from Discovery to Engineering in Biotech
11:24 AI shaping geopolitics and defense
14:32 Innovative warfare technologies impacting geopolitical dynamics
16:06 The future of warfare shifting towards technology and economic strength
19:02 Technology's increasing political influence
20:33 AI's impact on the movie industry and automation concerns
23:22 Importance of keeping technology open and democratic
24:45 Impact of Regulation on AI Innovation
27:27 The AI Revolution at higher levels vs. Robotics Revolution happening soon
28:56 Rapid advancement in hardware capabilities for human-like robots
31:37 Investing across diverse AI approaches
32:56 Value capture uncertainty in AI industry
35:40 Importance of deep domain expertise in starting tech companies
37:03 Successful startups often come from deep understanding of problems and innovative solutions.
Crafted by Merlin AI.
@Aivira_co 19 days ago
Joe Spisak’s session at Ray Summit 2024 provides an in-depth look at Meta's AI roadmap, focusing on the Llama ecosystem’s transformative impact. He explores practical approaches to building scalable generative AI agents and emphasizes the latest Llama models, their real-world applications, and system-level safety considerations. Spisak's insights offer valuable guidance for developers aiming to harness full-stack AI capabilities, from foundational models to advanced applications, equipping attendees to drive future AI innovation.
@SANN-1969 19 days ago
5D AI designs for emotion and feeling with love upgraded to compassion wisdomly full of knowledges
@erkinsagroglu8519 19 days ago
If sequences of different sizes can be processed in parallel (say request 1 is generating 11th token and request 2 is generating 3rd token), how come those two operations (Query vector of request 1 - say dimension 1x50 - dot product with previous tokens' key vectors matrix 11x50) and (1x50 dot product 3x50) can be batched together?
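One generic way to put the two shapes from the question above (1x50 against 11x50, and 1x50 against 3x50) into a single batch is to pad the shorter key matrix to the longer length and mask the padded positions before the softmax. The sketch below is only an illustration of that padding-and-masking idea under stated assumptions: the function name is hypothetical, the Python loop stands in for one batched matrix multiply, and it is not a description of how vLLM schedules or batches requests.

```python
import math

NEG_INF = float("-inf")


def padded_attention_weights(queries, key_batches):
    """Softmax attention weights for several requests, padded to a common length.

    queries:     one query vector per request
    key_batches: one list of key vectors per request (lengths may differ)
    Returns one weight row per request, each of length max_len; padded slots get 0.0.
    """
    max_len = max(len(keys) for keys in key_batches)
    rows = []
    for q, keys in zip(queries, key_batches):
        scores = []
        for pos in range(max_len):
            if pos < len(keys):
                scores.append(sum(qi * ki for qi, ki in zip(q, keys[pos])))
            else:
                scores.append(NEG_INF)  # mask: padded slot contributes exp(-inf) = 0
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows
```

With lengths 11 and 3, both rows come back with 11 entries, and the last 8 entries of the shorter request are exactly zero, so the padding does not change its attention distribution.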
@STEVEO143ASW 19 days ago
Very superficial stuff. I am really disappointed in Andreessen whom I used to admire.
@yoni-3240 19 days ago
If Dr. Evil and Conehead had a baby...
@DJ-Illuminate 20 days ago
I used Chat to talk to a French person who couldn't speak English on Discord. He had no idea I couldn't speak French.
@DJ-Illuminate 20 days ago
YouTube just added AI to it. I only want ONE AI not AI built into every app.