Gradient Flow
United States
Joined May 3, 2020
Gradient Flow presents a rich array of high-quality content on data, technology, and business, with a focus on machine learning and AI. Named by Coursera as one of the Top 10 Sites for Data Scientists, Gradient Flow helps you stay ahead of the latest technology trends and tools with in-depth coverage, analysis, and insights.
Learn the latest trends and best practices in data, technology and business, with a focus on machine learning and AI.
gradientflow.com/subscribe/
Paradigm Shifts in Data Processing for the Generative AI Era: Robert Nishihara of Anyscale & Ray.io
Episode Notes: thedataexchange.media/ray-data/
Robert Nishihara is co-founder of Anyscale and co-creator of Ray, the open source project that has emerged as the AI Compute Engine.
** Sections **
The Paradigm Shift: Data's Role in AI - 00:01:42
Challenges of Multi-Modal Data Tooling - 00:05:24
AI-Centric Workloads and Evolving Data Pipelines - 00:08:15
Scaling and Distributed Processing for Unstructured Data - 00:11:03
Enterprise Adoption of AI and Infrastructure Challenges - 00:13:30
Experimentation in AI: From Data Collection to Evaluation - 00:16:13
AI Development Lifecycle: From Ideation to Production - 00:18:30
The Next Data Type to Dominate AI: Images and Video - 00:19:25
Understanding and Mining Video Data - 00:21:36
Scaling Laws and Their Current Limitations - 00:24:12
Improving Data Quality and Reasoning Capabilities - 00:26:53
Looking Ahead to AI Developments in 2025 - 00:28:29
The Role of Post-Training in Enhancing Foundation Models - 00:30:37
Future Predictions: The Evolution of Foundation Models - 00:31:40
Views: 607
Videos
Data Exchange Podcast (Episode 264): Ben Lorica and Paco Nathan
862 views, 21 hours ago
Episode Notes: thedataexchange.media/roundup-2024-12/ Sections State of semiconductor export restrictions - 00:00:38 Commercial vs. Open AI Models: Analyzing Market Trends and Development Plateaus - 00:13:08 Semiconductor Manufacturing Intel - 00:28:10 The Intersection of Physics, Chemistry, and AI Development - 00:35:35 Quick Hits: Bluesky and Crypto - 00:40:01 Quick Hits: AI Apps (Sound Isola...
Data Exchange Podcast (Episode 263): Qian Li and Peter Kraft of DBOS
1.1K views, 14 days ago
Episode Notes: thedataexchange.media/dbos/ Sections Core Problems Addressed by DBOS - 00:00:48 Open Source and Durable Execution - 00:01:23 Error Handling and Recovery Challenges - 00:02:28 Demonstration: Building an AI Refund Agent - 00:03:40 Real-Time Data Access in AI Agents - 00:06:02 DBOS in Data Pipelines and RAG Use Cases - 00:15:00 Integration with Postgres and Hosting Options - 00:16:3...
Data Exchange Podcast (Episode 262): Shreya Rajpal of Guardrails AI
1.2K views, 21 days ago
Episode Notes: thedataexchange.media/the-essential-guide-to-ai-guardrails/ Shreya Rajpal is CEO and Cofounder of Guardrails AI and co-creator of the popular open source project Guardrails, a Python framework that helps build reliable AI applications. Sections What Are AI Guardrails? Definitions and Applications - 00:00:51 Early Adoption and Industry Awareness of Guardrails - 00:04:00 Exploring ...
Data Exchange Podcast (Episode 261): Deepti Srivastava of Snow Leopard
875 views, 28 days ago
Episode Notes: thedataexchange.media/snow-leopard/ Deepti Srivastava is the Founder and CEO of Snow Leopard. We dive into Snow Leopard’s innovative approach to data integration, exploring its live data access model that bypasses traditional ETL pipelines to offer real-time data retrieval directly from source systems. Sections Introduction to Snow Leopard and its Mission - 00:04:00 The Value of ...
Understanding Visual Language Models
61 views, 1 month ago
Learn more ⇶ gradientflow.com/understanding-visual-language-models
Data Exchange Podcast (Episode 260): 2024 Generative AI in Healthcare Survey Results w/ David Talby
1.2K views, 1 month ago
Episode Notes: thedataexchange.media/2024-generative-ai-in-healthcare-survey-results/ In this episode David Talby (CTO of John Snow Labs) and I present the results of the 2024 Generative AI in Healthcare Survey. Sections Introduction and Survey Overview - 00:00:00 Budgeting and Investment in Generative AI - 00:04:48 Large Language Models (LLMs) in Healthcare - 00:05:31 Use Cases and Challenges ...
The Paradigm Shift Transforming Data Processing & Data Preparation for AI
95 views, 1 month ago
Read the article ⇶ gradientflow.substack.com/p/paradigm-shifts-in-data-processing
Data Exchange Podcast (Episode 259): Monthly Roundup with Ben Lorica & Paco Nathan
1K views, 1 month ago
Episode Notes: thedataexchange.media/roundup-2024-11/ *Sections* BAML - 00:00:53 AI and Biotech - 00:12:45 Tencent Hunyuan Large - 00:19:11 AR/VR and AI at Disney - 00:25:24 Amazon Prime Air Delivery (drones) - 00:31:18 Voice models: F5-TTS and Moonshine - 00:39:37 Infrastructure: storing vector embeddings; Why AI Workloads Challenge Kubernetes - 00:41:41 Entity Linking; Multimodal LLMs - 00:51...
Data Exchange Podcast (Episode 258): Vasant Dhar of NYU
902 views, 1 month ago
Episode Notes: thedataexchange.media/building-the-future-of-finance-inside-ai-valuation-bots/ Vasant Dhar is a Professor at the Stern School of Business and the Center for Data Science at NYU. He’s one of the creators of the Damodaran Bot, an AI-powered system designed to emulate the valuation analysis and investment insights of renowned finance professor Aswath Damodaran. Sections Early Machin...
Data Exchange Podcast (Episode 257): Vaibhav Gupta of Boundary and BAML
3.4K views, 1 month ago
Episode Notes: thedataexchange.media/baml/ Vaibhav Gupta is the CEO and co-founder of Boundary and co-creator of BAML. Sections Extracting Structured Data from LLMs - 00:01:24 Pivot from RAG to BAML for Better Data Results - 00:03:22 Challenges in RAG Pipelines and BAML’s High-Quality Data Approach - 00:04:08 Overview of BAML’s Information Extraction Capabilities - 00:05:07 Reducing Token Usage...
Data Exchange Podcast (Episode 256): Tim Persons of PwC
986 views, 2 months ago
Episode Notes: thedataexchange.media/tim-persons-2024-07/ In this conversation with Tim Persons, AI Leader at PwC, we explore the current landscape of generative AI adoption, examining how enterprises are navigating budget trends, moving from experimentation to full-scale deployment, and addressing cultural challenges along the way. Sections Overview: Adoption of Generative AI in Enterprises - ...
Data Exchange Podcast (Episode 255): Monthly Roundup with Ben Lorica & Paco Nathan
1.1K views, 2 months ago
Episode Notes: thedataexchange.media/roundup-2024-10/ Sections Ray Compiled Graphs - 00:00:43 SB 1047 Is Vetoed, what next? - 00:07:57 Structure Is All You Need: enhancing RAG with structured & contextual information - 00:13:41 Llama 3.2 & the state of frontier model developers - 00:27:40 vLLM and Ray Data: status reports - 00:39:04 The Bigger-is-Better Paradigm in AI - 00:44:35 Recommendations...
Data Exchange Podcast (Episode 254): Matt Welsh of Aryn AI
1.3K views, 2 months ago
Episode Notes: thedataexchange.media/matt-welsh-2024-09/ Matt Welsh is a technical leader at Aryn AI, an AI-powered ETL system for RAG frameworks, LLM-based applications, and vector databases. Sections The Changing Nature of Programming - 00:02:24 Trusting AI to Build Pipelines - 00:04:29 The Democratization of Programming - 00:07:04 The Role of LLMs in Programming - 00:10:42 Challenges in Inte...
Data Exchange Podcast (Episode 253): Mars Lan of Metaphor
1.9K views, 2 months ago
Episode Notes: thedataexchange.media/the-security-debate-how-safe-is-open-source-software/ Mars Lan is Co-Founder & CTO at Metaphor, an AI-powered social platform that enhances data governance by empowering all employees, not just data teams, to easily collaborate, search, and share insights through an intuitive, AI-driven interface. Sections Security and Vulnerabilities in Open Source Software -...
Data Exchange Podcast (Episode 252): Yishay Carmiel of Meaning.team
2.4K views, 3 months ago
Data Exchange Podcast (Episode 251): Aurimas Griciūnas of Neptune.ai
3.3K views, 3 months ago
Data Exchange Podcast (Episode 250): Monthly Roundup with Paco Nathan
1.1K views, 3 months ago
Data Exchange Podcast (Episode 249): Petros Zerfos and Hima Patel of IBM Research and Data Prep Kit
1.4K views, 3 months ago
Data Exchange Podcast (Episode 248): Andrew Ng
1.2K views, 3 months ago
Data Exchange Podcast (Episode 247): Jay Dawani, CEO and founder of Lemurian Labs
4.6K views, 4 months ago
Data Exchange Podcast (Episode 246): Monthly Roundup with Paco Nathan
1.8K views, 4 months ago
Data Exchange Podcast (Episode 245): Evangelos Simoudis of Synapse Partners
2.5K views, 4 months ago
Data Exchange Podcast (Episode 244): Shuveb Hussain of Unstract
2.4K views, 4 months ago
Data Exchange Podcast (Episode 243): Alfred Spector of M.I.T.
1.9K views, 5 months ago
Data Exchange Podcast (Episode 242): Monthly Roundup with Paco Nathan
1.7K views, 5 months ago
Data Exchange Podcast (Episode 241): Andrew Burt of Luminos.Law and Luminos.ai
3K views, 5 months ago
Data Exchange Podcast (Episode 240): Chang She of LanceDB
2.1K views, 5 months ago
Data Exchange Podcast (Episode 239): Ajay Kulkarni and Mike Freedman of Timescale, on vector search
1K views, 6 months ago
Data Exchange Podcast (Episode 238): Philip Rathle of Neo4j
4.5K views, 6 months ago
Congrats Vaibhav, proud of you.
This software has made my life so much easier. Thanks for the recommendation!
This software is a lifesaver. Thanks for the detailed explanation!
You have studied my techniques? Ok. May I help you?
This is awesome, and it's pretty difficult. I developed something recently for a specific problem, and it's annoying how easy it is to find yourself writing out an ETL pipeline essentially to enable the LLM when the goal was to enable an ensemble of LLMs to "figure it out." You have to constantly push back. It was essentially grabbing a line from a datatable and querying fairly messy data dumped from PDF files and websites, allowing fuzzy matches with a rank, taking a subset, and then comparing the information between these dissimilar documents to find one to three specific pieces of information. It then updates a datatable for machine learning models. It was a challenge and required a more intelligent LLM (unless you're going to be a software engineer for every single step in detail... then why do Agentic at all?). The solution cost about $9 and was pretty slow, but a human would have taken about six months. Do you think there are lots of people that are basically writing entire codebases of SQL queries, where an LLM becomes an unnecessary step or layer in the SQL query that a human spends writing from beginning to end, which is not what they wanted. The programmer becomes the servant to the LLM.
Doesn't an agent work as a pipeline?
Shouldn't an agent set up a system capable of pushing and pulling data?
My understanding is that an agent should be capable of analyzing data from timeline
Any thoughts on Nvidia NIM?
I love how you made your mind map in this video! Could you share the tools or techniques you used to create it? I’d love to make mine look as professional as yours.
💐💐💐💐💐💐💐💐💐💐💐🇮🇳🇮🇳🇮🇳🙏🙏🙏🙏🙏❤❤❤
I think Paco makes a very interesting and crucial point at the 30:35 mark. These LLM-powered graph builders are creating graphs from unstructured data, but how much domain knowledge do they possess to build truly sensible and accurate graphs in specific areas? For example, if I work with medical record systems and want to enhance them with data from medical guidelines, how confident can I be that the LLM understands the proper relationships between diseases, symptoms, and medical encodings like ICD-10 to generate a sensible and accurate graph? I know that some SciSpacy models have been trained on biomedical data and could theoretically do a better job of extracting relevant medical entities and relationships. How can this be incorporated into current GraphRAG workflows? I was hoping Paco would discuss this more and possibly explain ways to improve the resultant knowledge graph, either using existing approaches (Microsoft GraphRAG or Neo4J Graph Builder) or other alternative methods.
insightful discussions!
That's a fantastic question from Ben: We can accomplish everything you've outlined using a straightforward software class, without the need for an agent.
It's the same use case all over, but AI can take 'fuzzy' input! One simple example of the superiority of agents even at current capability? Web scraping.
Great interview but man, you have to stop cutting your guest off mid sentence. It's rude and it hurts the flow.
Ao
0:25
😕 "promo sm"
Dvngeelrtt
Hello there! I recently stumbled upon your YouTube-recommended video discussing Data Exchange Podcast. I was thoroughly impressed by your presentation. Your focus on developing self-discipline, staying motivated, and being consistent deeply resonates with the topics I cover on my own channel. Like you, I'm dedicated to empowering and inspiring my audience with practical guidance. Your unique perspective and clear communication style have convinced me to hit that subscribe button. Keep up the great work; you're definitely making a positive difference!
I hope Oxide or somebody else ensures that compute, network, and storage each get their own purpose-specific functions, maybe something integrated and FPGA-like?
Really enjoy listening when Dmitriy explains how A.I. is being applied at Ginkgo. He breaks it down into bite-sized pieces that I am able to understand.
"Nothing is better than showing". I could not agree more! I worked two months on an MVP just to be able to show my target audience what I wanted to sell them. I found out someone was already doing what my app did but better. Oh well.
Good on quantization, CPU, etc.
Jbbbb
It is a bit annoying to see the host interrupting the guest so often 😊
Great guest.
Great discussion.
Very good conversation. Like every other technology, AI requires efforts to make it work. One can’t simply assume that it is like flicking a switch. This realisation is important for us to begin to realise the value at scale. Else there will be disappointment.
Excellent discussion. Totally agree it’s a revolution in communication. Also creation.
I really like that Sudhir showed an open-minded opinion about the market trends, not just pitching Neo4j, when in fact Neo4j can be a very important player in this wave of the AI revolution because it mixes graph and vector database capabilities.
🎉
Would love more podcasts on navigating, defending against, and combatting next generation identity theft, social engineering, misinformation/disinformation, etc. That’s honestly what keeps me up most at night regarding these rapidly developing technologies.
Great discussion.
Nice
Fascinating. Great listen!
😊😊😊 😊
😊
😊
Really very useful tips, very well done, get going, best wishes
Fkdhttnk😢😢😢😢
Very excited to try this. Great discussion.
'Promo sm'
Awesome! Super excited to listen to this!
Really great interview!
great interview. thanks!
It looks like Python, it runs like C.
Ok
Bought the book after just a few minutes, in the chat about statistics. Just great! I wish Chris had been my prof at uni! Thank you Ben. Adding real value.
Very informative and interesting podcast. Loved watching, Keep it up👏🙏🙏
AI innovation. Aleph Alpha leads the way. Such a great watch.