Beyond Parquet and ORC: Upgrading Data Infrastructure for Multi-modal AI with Lance Col... Chang She

  • Published on Nov 26, 2024
  • Beyond Parquet and ORC: Upgrading Data Infrastructure for Multi-modal AI with Lance Columnar Format - Chang She, LanceDB
    AI has the potential to become ubiquitous across enterprises, which brings a host of new challenges to data lakehouses. Existing formats like Parquet and ORC are not well suited to ML/AI workloads such as vector search, deep learning, model evaluation, and EDA on unstructured data. Lance is a new open source columnar format that offers more than an order of magnitude better performance for AI workloads and is optimized for modern storage options. In this talk we'll motivate the new format, talk briefly about how it's designed, and see how it can be applied to AI workloads to offer significantly better performance with nearly zero additional effort.
