Filtering vs Enriching Data in Apache Spark

แชร์
ฝัง
  • เผยแพร่เมื่อ 14 ต.ค. 2024
  • Apache Spark provides lot of options of joining the data for its data sets. This talk will focus on comparing the approach of Enriching the data (left outer join) versus filtering the data(inner join).How both approaches end up with same result and highlight the merits of Enriching the data approach helped us in Capital One. We at CapitalOne are heavy users of Spark from its initial days.This talk will provide more details of how we evolved from filtering to Enriching the data for credit card transactions and highlight what benefits we got by following Enriching the data approach. Being the financial institution,we are bound by regulation.We need to back trace all credit card transactions processed through our engine.Will be providing the details on how Enriching the data approach solved us this requirement. This talk will provide more context on how financial institutions can use Enriching the data approach for their Spark workloads and back trace all the data they processed. We have used the filtering approach in Production and what were it issues and why we moved to Enriching the data approach in Production will also be covered in this talk. Attendees will be able to take away more details on Enriching and filtering options to decide on their use cases.It will be more relevant for users who wants to trace their data set processing with more granularity in Apache Spark.
    About:
    Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
    Read more here: databricks.com....
    Connect with us:
    Website: databricks.com
    Facebook: / databricksinc
    Twitter: / databricks
    LinkedIn: / databricks
    Instagram: / databricksinc Databricks is proud to announce that Gartner has named us a Leader in both the 2021 Magic Quadrant for Cloud Database Management Systems and the 2021 Magic Quadrant for Data Science and Machine Learning Platforms. Download the reports here. databricks.com...

ความคิดเห็น •