Voltron Data
  • 40
  • 48 414
Querying Sales Data With Ibis
Patrick provides a quick overview of a small project he worked on to familiarize himself with Ibis. Using mock sales data, he constructs a sales data query from scratch using f-strings and Ibis expressions, then discusses how Ibis makes parameterizing queries much easier.
More information: github.com/p-a-a-a-trick/ibis-sales-query
Views: 732

Videos

Powering Data-Centric AI with Arrow
601 views, 2 years ago
Speaker: Henry Ehrenberg, Co-Founder at Snorkel AI. Snorkel AI recently adopted Arrow to help power Snorkel Flow, their data-centric development platform, which helps enterprise data science teams build high-quality training datasets and ML models quickly. In this talk, Henry will cover how the Snorkel AI team evolved their data and compute architecture to leverage Arrow. With Arrow under the hoo...
How to Use the New Contributor’s Guide to Start Contributing to Apache Arrow (Part 2 - Demo)
249 views, 2 years ago
Speaker: Alenka Frim, Open Source Apprentice at Voltron Data This is the demo component of another talk, which can be found here: th-cam.com/video/a-nhqPhYWGE/w-d-xo.html Original contribution in GitHub: github.com/apache/arrow/pull/13329 Resources: Apache Arrow: arrow.apache.org/ Project Documentation: arrow.apache.org/docs/ The New Contributor's Guide: arrow.apache.org/docs/developers/guide/i...
All in on Apache Arrow
2.3K views, 2 years ago
Speaker: Randy Zwitch, Head of Developer Relations, Streamlit
The Data Thread Conference: Live Broadcast Sessions (Previously Recorded)
1.1K views, 2 years ago
Recorded on June 23, 2022 Agenda: Welcome - Marlene Mhangami, Developer Advocate, Voltron Data Keynote - Apache Arrow co-creators Wes McKinney & Jacques Nadeau High Performance Computing Panel Discussion with moderator Jing Brewer, VP of Product Strategy, Voltron Data Fireside Chat with Peter Wang, CEO, Anaconda & Josh Patterson, CEO, Voltron Data DataStax Featured Talk with Sebastián Estévez Q...
Building the First GPU Visual Graph AI Platform with End to End Apache Arrow
482 views, 2 years ago
Speaker: Leo Meyerovich Founder, Graphistry, Inc.
When Data Engineering Meets Security Analytics
208 views, 2 years ago
Speaker: Matthias Vallentin, CEO and Co-Founder, Tenzir. In this talk, Matthias presents his group's highly pluggable C++ engine for security telemetry data that builds on top of Arrow. He shows their wins where they can leverage drop-in functionality, as well as where they face challenges.
Time Series Data Transformation with Arrow Compute Engine
723 views, 2 years ago
Speaker: Li Jin, Software Developer, Two Sigma
Using Arrow, with Numba Kernels, to Generate AI Workflows
493 views, 2 years ago
Speaker: John Murray, Director, Fusion Data Science and Visiting Professor, Data Science Lab, University of Liverpool. In this session, John demonstrates how to use the Numba Python compiler to create custom kernel functions on top of Arrow tables, generating end-to-end AI workflows with TensorFlow (training), TensorRT (inference), and RAPIDS cuML (clustering). The talk is based on his group'...
Velox: An Open-Source Unified Execution Engine
3.2K views, 2 years ago
Speaker: Pedro Pedreira, Software Engineer at Meta
Everyone Should Use Apache Arrow for Data Systems Research
1K views, 2 years ago
Speaker: Andrew Crotty, Assistant Professor at Northwestern University The conventional wisdom is that it takes about a decade of effort to build a stable, full-featured data analytics system. Unfortunately, this type of systems work does not translate well to academic environments, where resources are more constrained and research progress (e.g., tenure review, funding duration) is typically m...
Why Apache Arrow is Important for Ruby
311 views, 2 years ago
Speaker: Sutou Kouhei, Co-Founder & President at ClearCode. It is widely known in the data processing world that Apache Arrow is important. Sutou shares why Apache Arrow is especially important for the Ruby community. He also introduces the Apache Arrow features that the Ruby community is working on.
Arrow and Substrait: Better Together
3.1K views, 2 years ago
Speaker: Ian Cook, Product Manager at Voltron Data Links from the slides: Substrait project: github.com/substrait-io/substrait DSLs producing Substrait: Python Ibis Substrait compiler: github.com/ibis-project/ibis-substrait R dplyr Substrait compiler: github.com/voltrondata/substrait-r SQL Substrait compiler (Isthmus): github.com/substrait-io/substrait-java/tree/main/isthmus Engines consuming S...
Accelerating Geospatial Computing in R and Python Using Apache Arrow
1.6K views, 2 years ago
Speakers: Dewey Dunnington, Senior R Developer at Voltron Data and Joris Van den Bossche, Software Engineer at Voltron Data The Apache Arrow and Apache Parquet ecosystems provide a flexible and efficient in-memory and on-disk format for tabular data. With implementations in most languages, Apache Arrow supports a growing set of tools and analytical workflows. In this talk, Dewey and Joris intro...
Ibis and Substrait: Standardized Analytics
785 views, 2 years ago
Speaker: Hussain Sultan, Field Engineering Director at Voltron Data The code used in this demonstration can be found at gist.github.com/gforsyth/496d680e1e29f0876df937ee5091e1b8
Apache Arrow on the Web and Beyond
1.3K views, 2 years ago
Mainlining Databases : Supporting Fast Transactional Workloads on Apache Arrow
245 views, 2 years ago
An Introduction to Arrow for Python Programmers
4.1K views, 2 years ago
Put Your Cassandra Python Driver On Steroids With Apache Arrow
354 views, 2 years ago
How to Use the New Contributor’s Guide to Start Contributing to Apache Arrow (Part 1)
217 views, 2 years ago
Navigating the San Francisco Art Scene with Ibis
375 views, 2 years ago
A New Hope For The Big Data Divergence
339 views, 2 years ago
Microkernel Notebooks
176 views, 2 years ago
Maximizing the Performance of DNA Analysis Using Apache Arrow
173 views, 2 years ago
GraphQL and Apache Arrow: A Match Made in Data
956 views, 2 years ago
A Developers' Journey Using Arrow with Tableau
231 views, 2 years ago
Apache Arrow and DataFusion: Changing the Game for Implementing Database Systems
2.4K views, 2 years ago
Torch Arrow Performant ML Preprocessing
1K views, 2 years ago
PyFroid: Scaling Data Preparation Using Database
297 views, 2 years ago
What Is Ibis + Simple Demo
2.8K views, 2 years ago

Comments

  • @multitaskprueba1
    @multitaskprueba1 3 months ago

    You are a genius! Fantastic video! Thanks!

  • @gloriamacia1120
    @gloriamacia1120 3 months ago

    amazing video!!

  • @pablomoretto8443
    @pablomoretto8443 4 months ago

    great explanation, thank you 👍🙏

  • @tamararodrigues3471
    @tamararodrigues3471 5 months ago

    Greaaaat video, thanks!!

  • @vikramsinhshinde8789
    @vikramsinhshinde8789 5 months ago

    This looks really promising!

  • @nndegwa1
    @nndegwa1 6 months ago

    Love it!

  • @pookiepats
    @pookiepats 8 months ago

    What is so funny?

  • @xkarika
    @xkarika 8 months ago

    Hi Matt - I think what you've done here is absolutely brilliant!!!! I think you solved the one major limitation of GraphQL that's been preventing it from taking over the data world. Have you thought about open-sourcing this?

  • @carvalhoribeiro
    @carvalhoribeiro 9 months ago

    Awesome presentation. Thanks for sharing this

  • @javierparra3234
    @javierparra3234 9 months ago

    This was really helpful, quite a full and easy to understand introduction, thanks!

  • @haraldurkarlsson1147
    @haraldurkarlsson1147 10 months ago

    Tom, I am not clear what specific data (from nflverse) you choose to work with. In your code what exactly is "data" ??

  • @haraldurkarlsson1147
    @haraldurkarlsson1147 10 months ago

    A brief discussion of what the parquet file format is and what the advantages are over regular flat files.

  • @haraldurkarlsson1147
    @haraldurkarlsson1147 10 months ago

    An excellent case for Arrow and DuckDB

  • @AshishSharma-pm1dc
    @AshishSharma-pm1dc 11 months ago

    Thank you for the session. Is there a detailed documentation for JS support. The apache arrow site points to a blank page

  • @tarasst6887
    @tarasst6887 1 year ago

    🎉🎉🎉😊

  • @zhitaoli4702
    @zhitaoli4702 1 year ago

    Hi, is it possible to share a link to the slides used in the presentation? Thanks

  • @user-vp7wp7dt7m
    @user-vp7wp7dt7m 1 year ago

    Great video - thanks Tom!

  • @arturocdb
    @arturocdb 1 year ago

    Incredibly useful, thank you so much!

  • @matattz
    @matattz 1 year ago

    So to summarize it, using SQL is getting old very fast! So would you suggest that a beginner should learn the very basics in SQL and focus more on ibis or now ponder for example? I don’t see why you would need SQL on a very high level when the playground is changing so rapidly nowadays. I get that many users with 7+ years of SQL knowledge are very frustrated that you basically could just use something like ibis but it is what it is

  • @pparsons12
    @pparsons12 1 year ago

    Thank you! I enjoyed your presentation and learned a few important “missing pieces” in my understanding of how these tools can work together.

  • @tomanizer
    @tomanizer 1 year ago

    Great talk and great initiative. Could you point out how and where to find out when the arrow timeseries compute functions come online?

  • @jorgenengmann4856
    @jorgenengmann4856 1 year ago

    super! thanks for this very useful tutorial.

  • @kamicheung4021
    @kamicheung4021 1 year ago

    great video, thank you for such a detailed comparison

  • @umitekmekci503
    @umitekmekci503 1 year ago

    I don't know much about arrow but if it is lazy the first timing can be wrong because arrow may not do the calculation until you need to use the output

  • @coolsameer9661
    @coolsameer9661 2 years ago

    Huge thanks for this talk! And for uploading it :)

  • @user-gg5fc6yg9f
    @user-gg5fc6yg9f 2 years ago

    Thank you Danielle Navarro !

  • @dasrotrad
    @dasrotrad 2 years ago

    Super tutorial Danielle. Thank you.

  • @ibananti
    @ibananti 2 years ago

    Very insightful talk, thank you!

  • @robinkohrs8097
    @robinkohrs8097 2 years ago

    That looks fantastic! But what if I do not have my data as cleanly organized in many "smaller" files, but rather in one giant CSV? Does Arrow still have benefits? :)

  • @jayjeetchakraborty9542
    @jayjeetchakraborty9542 2 years ago

    Can we replace Arrow Compute Engine (Acero) with Velox ?

  • @vishcanaran5139
    @vishcanaran5139 2 years ago

    Thanks Tianyu, we are focused on a similar scaled down approach regarding transactional Arrow for production workloads and your talk and reference papers put it all together. Thank you again.

  • @twainyoung
    @twainyoung 2 years ago

    thanks for sharing this video.😁

  • @koutousu
    @koutousu 2 years ago

    If you have any questions, please ask me on Twitter! twitter.com/ktou (English is OK!)