Performance Testing and Comparative Benchmarking for Data.Table - Doris Afriyie Amoakohene

  • Published on 17 Sep 2024
  • The data.table package in R is a powerful tool for data analysis, combining efficient C code with user-friendly R syntax. To ensure its long-term sustainability, the NSF POSE program has funded a project from 2023 to 2025 to build a self-sustaining ecosystem around data.table. In this presentation, we discuss the importance of performance testing in the development of data.table and present a general approach that can be applied to other R packages. By creating performance tests based on historical regressions, we can track the package's execution time and memory usage across versions, ensuring that code changes and new releases do not degrade its performance. We demonstrate how to use the atime package to benchmark execution time and memory usage, giving developers confidence that the package remains efficient and reliable. This approach benefits data.table and also serves as a model for other R package developers who want to improve the performance and popularity of their own projects.
    Doris Afriyie Amoakohene, Northern Arizona University
    Doris holds a BSc in Statistics and is currently pursuing a master's degree in Informatics at Northern Arizona University. She is the Founder and CEO of LAG Prestige Foundation. Doris is also a Research Assistant in a machine learning lab and is actively involved in expanding the open-source ecosystem around data.table in R as part of her ongoing project. With a particular interest in informatics and research, she is also seeking collaborations in the field of data analytics.
