Real-time Personalization with Kafka Streams and TensorFlow / Nemanja Milicević

  • Published on Sep 8, 2024
  • Welcome to Tech Internals Conf!
    The conference will be held on April 19, 2024 in Limassol, Cyprus.
    Program, details and tickets at internals.tech...
    --------
    Serbia 2023
    This talk explores real-time recommender systems from an engineering perspective. It will cover a general overview of recommender systems, a case study on online sports betting personalization with Kafka Streams and TensorFlow, and considerations for system operations and performance.
    While batch recommendations can be useful in certain contexts, there are many situations where real-time recommendations are necessary for optimal user experience. The talk will explore the differences between offline and online environments for recommender systems and will provide an overview of the different components that make up a recommender system, including retrieval, filtering, scoring, and ordering.
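The four stages named above (retrieval, filtering, scoring, ordering) can be sketched as a chain of small functions. This is a minimal illustration, not the system from the talk: the `Event` fields, the user's `sport_affinity` map, and the hand-written scoring formula are all hypothetical stand-ins, and a real deployment would likely replace the formula with a trained model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A hypothetical sports event that could be recommended."""
    event_id: str
    sport: str
    popularity: float  # assumed to lie in [0, 1]
    is_open: bool      # whether bets can still be placed


def retrieve(catalog, favorite_sports):
    """Retrieval: narrow the full catalog to a candidate set."""
    return [e for e in catalog if e.sport in favorite_sports]


def filter_candidates(candidates, already_bet_on):
    """Filtering: drop candidates that must not be shown."""
    return [e for e in candidates if e.is_open and e.event_id not in already_bet_on]


def score(candidates, sport_affinity):
    """Scoring: attach a relevance score to each candidate.

    A trained model would normally produce this number; a hand-written
    formula stands in for it here.
    """
    return [(e, sport_affinity.get(e.sport, 0.0) * e.popularity) for e in candidates]


def order(scored, top_k):
    """Ordering: sort by score, highest first, and keep the top k."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [e.event_id for e, _ in ranked[:top_k]]


def recommend(catalog, sport_affinity, already_bet_on, top_k=2):
    """Run the four stages in sequence for one user."""
    candidates = retrieve(catalog, set(sport_affinity))
    candidates = filter_candidates(candidates, already_bet_on)
    return order(score(candidates, sport_affinity), top_k)


catalog = [
    Event("e1", "football", 0.9, True),
    Event("e2", "tennis", 0.8, True),
    Event("e3", "football", 0.5, True),      # removed by filtering: already bet on
    Event("e4", "football", 0.7, False),     # removed by filtering: closed
    Event("e5", "basketball", 0.95, True),   # never retrieved: no affinity
]
result = recommend(catalog, {"football": 0.6, "tennis": 0.5}, already_bet_on={"e3"})
print(result)
```

Keeping the stages separate means the cheap steps (retrieval, filtering) shrink the candidate set before the comparatively expensive scoring step runs.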
    The case study on online sports betting personalization with Kafka Streams and TensorFlow will be a major highlight of the talk. It will discuss why Kafka Streams was chosen for this use case and explore alternatives such as Apache Flink. An architecture diagram will outline the components of the system, including Apache Spark and TensorFlow for model training, MLflow as a model registry, and Apache Cassandra for serving recommendations. The talk will also cover considerations for the feature store, including the use of Apache Cassandra and the Kafka Streams GlobalKTable abstraction with a local RocksDB cache. Finally, it will discuss how Apache Cassandra can be used for serving recommendations, including the use of Kafka Connect sink connectors.
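The idea behind a fully replicated table with a local cache is that every processing instance holds the complete feature table, so enriching a stream record is a local key lookup rather than a network call. The sketch below imitates that idea in plain Python. It does not use the Kafka Streams API: `LocalFeatureTable`, `enrich`, and the record fields are hypothetical names chosen for illustration, and a dict stands in for the RocksDB-backed store.

```python
class LocalFeatureTable:
    """Plain-Python stand-in for a fully replicated, locally cached table."""

    def __init__(self):
        self._latest = {}

    def apply_update(self, key, features):
        """Keep only the latest value per key, like a changelog-backed table."""
        if features is None:
            self._latest.pop(key, None)  # None acts as a delete marker
        else:
            self._latest[key] = features

    def get(self, key):
        """Return the current features for a key, or None if absent."""
        return self._latest.get(key)


def enrich(events, table):
    """Join each event with the user's features via a local lookup.

    Events whose user has no features are dropped, as in an inner join.
    """
    for event in events:
        features = table.get(event["user_id"])
        if features is not None:
            yield {**event, "features": features}


table = LocalFeatureTable()
table.apply_update("u1", {"favorite_sport": "football"})
table.apply_update("u2", {"favorite_sport": "tennis"})
table.apply_update("u1", {"favorite_sport": "basketball"})  # newer value wins
table.apply_update("u2", None)                               # u2 is deleted

events = [
    {"user_id": "u1", "action": "open_app"},
    {"user_id": "u2", "action": "open_app"},
]
enriched = list(enrich(events, table))
print(enriched)
```

The trade-off in this design is memory and disk on every instance in exchange for lookups that need no repartitioning of the stream and no remote round trip.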
    In addition to the case study, the talk will cover operational and performance considerations for recommender systems. It will discuss Kafka consumer rebalancing, including rebalances triggered by adding or removing instances, and how standby state replicas can help alleviate the problem. It will also discuss deploying new model versions, feature data quality pipelines, and the importance of model monitoring and retraining.
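Why standby state replicas help can be shown with simple arithmetic. When a stateful task moves to another instance, that instance must bring its local state up to date from the changelog before it can serve; a standby copy that has been following the changelog only needs to catch up on its lag. The function below is a toy model under simplifying assumptions (the changelog starts at offset 0 and is not compacted), and the offsets are made-up numbers. In Kafka Streams the number of standby copies is controlled by the `num.standby.replicas` setting.

```python
def records_to_replay(changelog_end_offset, standby_offset=None):
    """Changelog records an instance must replay before serving a task.

    standby_offset is the last offset already applied to a local standby
    copy, or None when the instance holds no copy of the state.
    """
    if standby_offset is None:
        return changelog_end_offset  # rebuild the state from the beginning
    return max(changelog_end_offset - standby_offset, 0)  # catch up on the lag only


cold = records_to_replay(5_000_000)
warm = records_to_replay(5_000_000, standby_offset=4_990_000)
print(cold, warm)
```

With these illustrative numbers, the instance without a standby copy replays the entire changelog, while the one with a standby copy replays only the records it was behind by.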
    Overall, the talk will provide a comprehensive overview of real-time recommender systems from an engineering perspective and will be relevant to anyone building intelligent systems that handle large volumes of real-time data.
