Zipline - Airbnb's Declarative Feature Engineering Framework | Airbnb
- Published on Nov 17, 2024
- ABOUT THE TALK (www.datacounci...)
ML Feature Engineering is a two-part problem:
1. Generating Realtime Features for Serving
2. Generating Historically Realtime Features for Training
We will briefly introduce how we address the former problem. The primary focus of this talk is the latter: generating Historically Realtime Features for training.
One way to think about generating Historically Realtime Features is to:
1. Travel back in time to a particular state of the world, as represented by data in production systems,
2. Snapshot it, and
3. Compute aggregations over the snapshot.
This is a useful way to visualize the problem, but the approach is intractable, especially in terms of compute and storage. We will introduce the algorithm in Zipline that makes backfilling features with Historically Realtime values feasible. We will borrow a few concepts from Abstract Algebra / Category Theory, but everything will be introduced from first principles.
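To make the cost argument concrete, here is a minimal Python sketch of the snapshot-then-aggregate idea next to a cheaper alternative. It is not Zipline's algorithm, which this description does not spell out. The event data, the query list, and the function names `naive_backfill` and `prefix_backfill` are all hypothetical, and the sketch assumes a simple sum aggregation and that an event occurring exactly at the query timestamp is visible to that query. It only illustrates why an associative aggregation lets partial results be reused instead of recomputing over a fresh snapshot for every training row.

```python
from bisect import bisect_right
from itertools import accumulate

# Hypothetical raw events: (timestamp, user_id, amount). Illustrative data only.
events = [
    (1, "u1", 10.0),
    (3, "u1", 5.0),
    (4, "u2", 7.0),
    (8, "u1", 2.0),
]

# Hypothetical training rows: (user_id, timestamp at which the feature is needed).
queries = [("u1", 2), ("u1", 5), ("u2", 3), ("u2", 9), ("u1", 8)]


def naive_backfill(events, queries):
    """Follow the three steps literally: one snapshot and one aggregation per query."""
    out = []
    for user, ts in queries:
        # Steps 1 and 2: rebuild the state of the world as of `ts`.
        snapshot = [e for e in events if e[0] <= ts]
        # Step 3: aggregate over the snapshot for this user.
        out.append(sum(amount for _, u, amount in snapshot if u == user))
    return out


def prefix_backfill(events, queries):
    """Sort once, keep running sums per user, answer each query by binary search."""
    per_user = {}
    for ts, user, amount in sorted(events):
        times, amounts = per_user.setdefault(user, ([], []))
        times.append(ts)
        amounts.append(amount)
    # Because addition is associative, each running sum reuses the previous one.
    index = {u: (t, list(accumulate(a))) for u, (t, a) in per_user.items()}
    out = []
    for user, ts in queries:
        times, running = index.get(user, ([], []))
        i = bisect_right(times, ts)  # count of events with timestamp <= ts
        out.append(running[i - 1] if i else 0.0)
    return out


naive_result = naive_backfill(events, queries)
prefix_result = prefix_backfill(events, queries)
```

The naive version scans every event for every query, so its work grows with the product of the two sizes. The second version sorts the events once and then does a logarithmic lookup per query. Both return the same values on the sample data.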
ABOUT THE SPEAKER
Nikhil Simha is a Software Engineer on the Machine Learning infrastructure team at Airbnb. He is currently working on Bighead (DSAA '19), an end-to-end machine learning platform. Prior to Airbnb, he built a scheduler (Turbine, ICDE '20) and a stream processing framework (RealTime Data @ FB, SIGMOD '16) at Facebook. He is interested in the intersection of compilers, machine learning, and realtime data processing systems. Nikhil received his Bachelor's degree in Computer Science from the Indian Institute of Technology, Bombay. When not working, he likes to boulder or play capoeira.
ABOUT DATA COUNCIL:
Data Council (www.datacounci...) is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers. Make sure to subscribe to our channel for more videos, including DC_THURS, our series of live online interviews with leading data professionals from top open source projects and startups.
FOLLOW DATA COUNCIL:
Twitter: / datacouncilai
LinkedIn: / datacouncil-ai
Facebook: / datacouncilai
Eventbrite: www.eventbrite... - Science & Technology