Gaël Varoquaux: Prepping Tables for Machine Learning Gets Easier [PyData Südwest]

  • Published on 14 Oct 2024
  • Recorded live at PyData Südwest 27 June 2023 at Mathematikon, University of Heidelberg
    Skrub: Prepping Tables for Machine Learning Gets Easier
    Gaël Varoquaux, Research Director, Inria, France
    In standard data-science practice, significant effort is spent on preparing the data before statistical learning. One reason is that the data come from various tables, each with its own subject matter and specificities. These must be transformed into a format that can be ingested by machine-learning models: assembled, aggregated, encoded. I will present some results from our research on developing machine-learning models that can more easily ingest raw, messy data. I will also discuss how we are using this understanding to build a new software package that facilitates preparing tables for machine learning. It's called skrub; it's in progress and not released yet, but I'm excited!
    🙋🏼‍♂️ Skrub is not released yet 👉 contributors are welcome!
    github.com/skr...
    Gaël Varoquaux is a research director working on data science at Inria (the French national research institute for computer science), where he leads the Soda team on computational and statistical methods to understand health and society with data. Varoquaux is an expert in machine learning, with an eye on applications in health and social science. He develops tools to make machine learning easier and suited for real-life, messy data. He co-founded scikit-learn, one of the reference machine-learning toolboxes, and helped build various central tools for data analysis in Python. He currently develops data-intensive approaches for epidemiology and public health, and worked for 10 years on machine learning for brain function and mental health. Varoquaux has a PhD in quantum physics supervised by Alain Aspect and is a graduate of Ecole Normale Superieure, Paris.
    gael-varoquaux...
    Thanks to our sponsors:
    Hei_INNOVATION for hosting: www.uni-heidel...
    Königsweg for organising: www.koenigsweg...
    PyData Südwest
    👋 PyData Südwest is a dynamic community passionate about Data Science, AI, Data Management, 🐍 Python, and Open Source within enterprise and research!
    We're a diverse blend of tech enthusiasts hosting regular meetups in locations like Karlsruhe, Mannheim, and Heidelberg. Uniting members across Nordbaden and Kurpfalz, we run a large-scale meetup for enriching insights. 🤝🌈
    Our events are listed on Meetup: www.meetup.com...
    ----
    PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
