Trino for Large Scale ETL at Lyft

  • Published on Feb 10, 2025
  • At Lyft, we process petabytes of data daily through Trino for a variety of use cases. A single query can run for as long as 4 hours with terabytes of memory reserved. Operating Trino ETL at this scale raises many challenges: how to make every query as performant as possible with a low failure rate; how to define clusters, routing groups, and resource groups for volume that changes over the course of a day; how to keep our commitment to user SLOs during unexpected spikes; and so on.
    We share what we have done with config tuning, identification of large queries and users, autoscaling, and fault-tolerant features to run Trino at this scale. We would also like to share our upcoming challenges and our plans to take Trino adoption further across the company.

Comments • 1

  • @mark-w6s5p 2 years ago +1

    Trino is much faster than Hive or Spark on read queries.