Jay Alammar - "Hands-On Large Language Models: Language Understanding and Generation"

Friday 25 October 2024, noon (EDT)
Toronto Data Workshop
Jay Alammar, Cohere
“Hands-On Large Language Models: Language Understanding and Generation”
Jay Alammar is Director and Engineering Fellow at Cohere (pioneering provider of large language models as an API). In this role, he advises and educates enterprises and the developer community on using language models for practical use cases.
He recently released "Hands-On Large Language Models", available from O'Reilly: www.oreilly.com/library/view/hands-on-large-language/9781098150952/

Views: 319

Videos

Rona Fang-Yu Hu "Integrating ‘Something Else’ Sexual Identity Responses in Health Disparity Studies"

40:42

61 views, 21 days ago

Friday 11 October 2024, noon (EDT) Toronto Data Workshop Rona Fang-Yu Hu, University of Michigan “Predicting the Unobserved: Integrating ‘Something Else’ Sexual Identity Responses in Health Disparity Studies Using Machine Learning and Resampling Techniques” Rona Hu is a second-year Master’s student in the Michigan Program in Survey and Data Science. She graduated from National Chengchi Universi...

Sean Taylor "Causal Discovery for Product Analytics"

53:10

258 views, 28 days ago

Friday 4 October 2024, noon (EDT) Toronto Data Workshop Sean Taylor, Motif “Causal Discovery for Product Analytics” I will discuss leveraging causal discovery in product analytics to uncover new insights and improve product development. First, I introduce a framework for categorizing causal questions, highlighting common challenges in product analytics, and emphasizing the need for discovering ...

Annie Collins, GivingTuesday, "Leveraging Data for Generosity"

56:22

49 views, a month ago

Friday 27 September 2024, noon (EDT) Toronto Data Workshop - rohanalexander.substack.com/ Annie Collins, GivingTuesday “Leveraging Data for Generosity: GivingPulse and the GivingTuesday Data Commons” The American non-profit sector faces significant challenges in fully understanding the complex landscape in which it operates, particularly when it comes to accessing reliable data on individual gi...

Sana Shams and Waris Bhatia “Navigating Canada’s Largest Corpus of Government Documents”

39:16

132 views, a month ago

20 September 2024 Toronto Data Workshop Sana Shams, University of British Columbia Waris Bhatia, University of British Columbia “Accessible Investigative Journalism: Navigating Canada’s Largest Corpus of Government Documents” Sana is a research fellow through UBC’s Data Science for Social Good Program. She is currently pursuing a BSc in cognitive systems with a minor in data science and is pass...

Naman Jain - "LiveCodeBench: Holistic and contamination free evaluation of LLMs for code"

56:48

95 views, 3 months ago

Friday 12 July 2024, noon (EDT) Toronto Data Workshop Naman Jain, UC Berkeley “LiveCodeBench: Holistic and contamination free evaluation of large language models for code” In this talk we introduce LiveCodeBench, a comprehensive and contamination-free benchmark for LLMs in code, which continuously collects new problems from LeetCode, AtCoder, and CodeForces. LiveCodeBench evaluates a wide range...

Terry Yue Zhuo "BigCodeBench: Benchmarking Code Generation"

1:03:52

119 views, 3 months ago

Thursday 11 July 2024, 9am (EDT) Toronto Data Workshop Terry Yue Zhuo, Monash University “BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions” In this talk we introduce BigCodeBench, a benchmark that challenges LLMs to invoke multiple function calls as tools from 139 libraries and 7 domains for 1,140 fine-grained programming tasks. Our evaluation of 6...

Kobi Hackenburg "Evaluating the persuasive influence of political microtargeting with LLMs"

52:16

96 views, 3 months ago

Friday 5 July 2024, noon (EDT) Toronto Data Workshop Kobi Hackenburg, Oxford Internet Institute, University of Oxford “Evaluating the persuasive influence of political microtargeting with large language models” Advances in LLMs have raised concerns over scalable, personalized political persuasion. In this talk, based on a paper recently published in PNAS, we integrate user data into GPT-4 promp...

Sasha Issenberg - “The Lie Detectives: In Search of a Playbook for Winning Elections”

54:17

57 views, 4 months ago

Friday 28 June 2024, noon (EDT) Sasha Issenberg “The Lie Detectives: In Search of a Playbook for Winning Elections in the Disinformation Age” Sasha Issenberg is a journalist and author of “The Lie Detectives: In Search of a Playbook for Winning Elections in the Disinformation Age” and four previous books, most recently “The Engagement: America’s Quarter-Century Struggle Over Same-Sex Marriage.”...

Jae Yeon Kim - "Field experimentation in the U.S. safety net"

1:02:50

81 views, 4 months ago

14 June 2024 Toronto Data Workshop Jae Yeon Kim, SNF Agora Institute at Johns Hopkins University “Field experimentation in the U.S. safety net” Jae Yeon Kim is an incoming assistant research scientist at the SNF Agora Institute at Johns Hopkins University and a research fellow at the Center for Public Leadership at Harvard Kennedy School. Previously, he worked as a senior data scientist at Code...

Lars Vilhuber - Privacy protection in RCTs: The challenge of privacy protection in the field

53:58

37 views, 4 months ago

7 June 2024 Toronto Data Workshop Lars Vilhuber, Cornell University Lars Vilhuber holds a Ph.D. in Economics from Université de Montréal, Canada, and is currently on the faculty of the Cornell University Economics Department. He has interests in labor economics, statistical disclosure limitation and data dissemination, and reproducibility and replicability in the social sciences. He is the Data...

Ethan Busby "AI-Enabled Persuasion Research: Experimenting with Effective Political Messaging"

56:26

68 views, 5 months ago

Friday 31 May 2024, noon (EDT) Toronto Data Workshop Ethan Busby, Brigham Young University Ethan Busby is an Assistant Professor of Political Science at Brigham Young University, specializing in political psychology, extremism, artificial intelligence, and computational social science. His research relies on various methods, using lab experiments, quasi-experiments, survey experiments, text-as-...

Belinda Li - "Eliciting Human Preferences with Language Models"

55:23

130 views, 5 months ago

24 May 2024 Toronto Data Workshop Belinda Li is a PhD candidate at MIT CSAIL, affiliated with the language & intelligence (LINGO) lab @ MIT. Her work focuses on improving the human-interpretability, reliability, and usability of language models: examining and improving representations of both (objective) world states and (subjective) human preferences in language models. She is funded by an NDSEG ...

Amanda Coston - Addressing validity in decision-making algorithms

54:55

79 views, 5 months ago

Friday 10 May 2024, noon (EDT) Toronto Data Workshop Amanda Coston, Microsoft Research and Berkeley Amanda Coston is a Postdoc at Microsoft Research in the Machine Learning and Statistics Team. In fall 2024 she will join the Department of Statistics at UC Berkeley as an Assistant Professor. Her work considers how - and when - machine learning and causal inference can improve decision-making in ...

Kosuke Imai "Does AI help humans make better decisions?"

1:07:28

130 views, 5 months ago

Toronto Data Workshop Friday 3 May 2024, noon (EDT) Kosuke Imai, Harvard University “Does AI help humans make better decisions? A methodological framework for experimental evaluation” Kosuke Imai is Professor in the Department of Government and the Department of Statistics at Harvard University. He is also an affiliate of the Institute for Quantitative Social Science where his office is located...

Abel Brodeur - Mass Reproducibility and Replicability: A New Hope

50:26

134 views, 6 months ago

Lenny Bronner - Election Modeling at The Washington Post

36:23

217 views, 6 months ago

Cameron Buckner - "The philosophy of Large Language Models"

43:36

221 views, 7 months ago

Laura Plein - Can LLMs demystify Bug Reports and translate them into Test Cases?

31:21

75 views, 7 months ago

Tom Davidson - "Harnessing Generative Artificial Intelligence for Sociological Research"

40:23

194 views, 7 months ago

Jonathan Mellon "Using LLMs to code open-text social survey responses at scale"

34:04

313 views, 7 months ago

Matheus Facure "Why Banking has the Coolest Stats/Data Science Problems"

42:02

373 views, 7 months ago

Sky CH-Wang - Do Androids Know They’re Only Dreaming of Electric Sheep?

38:21

33 views, 8 months ago

Richard Iannone - Using Great Tables to Make Presentable Tables in Python

28:39

697 views, 8 months ago

Bradley Congelio - Introduction to NFL Analytics with R

33:30

300 views, 8 months ago

Oliver Giesecke - AI at the Frontiers of Economic Research

49:21

126 views, 9 months ago

Kristina Gligorić - "In-class Data Analysis Replications: Teaching Students while Testing Science"

35:18

103 views, 9 months ago

Gregory Zuckerman - "Renaissance, data, and Wall Street"

47:16

502 views, 9 months ago

Wendy Foster - Socio-technical processes for data integrity

53:18

41 views, 11 months ago

Exploring Alternatives to REST for Accessing Public Data Sets

24:35

136 views, 11 months ago

Comments

@coolvania 12 days ago
#BasedRohan - your accent makes the content that much more enjoyable
@surfbort 28 days ago
Thanks for sharing great talk
@kingwsd 3 months ago
Thank you for sharing!😁
@Jandodev 7 months ago
I think i'm going to give my format for BRX a go on the SWE-bench!
@tomthetitan101 9 months ago
Great talk
@murraysondergard3210 10 months ago
Your new book is excellent! I am really enjoying it.
@CanDoSo_org a year ago
19:30: The chalkboard is so great. But how to make this effective in HTML and PDF versions?
@yussifmohammed9324 2 years ago
Thanks Mine and Rohan, Thats really insightful
@itumelengmosala513 2 years ago
Beautiful
@mphandejohn 2 years ago
Beautiful indeed
@PWijekoon 2 years ago
Thank you Mine. Just started to use quarto. This is very helpful.
@cynthiahuang7393 2 years ago
01:18 RMarkdown -> Quarto + live demo
04:16 RStudio Visual Editor -- removes some cognitive load of learning RMarkdown + coding, forward slash "/" to insert anything
05:35 Quarto can render RMarkdown even if you just change the file extension
06:10 Shared features -- output: --> format:, live rendering, citations (from DOI search!), in-line R code
12:20 Quarto specific -- YAML style chunk options (great for alt text) but mixed style works!; global chunk options in yaml header; code-link: true
16:18 Visual Editor helps with avoiding minutiae of markdown syntax -- alt-text, captions, links, tab-sets
18:13 Change output from html to reveal.js slides -- `chalkboard: true` adds annotation tool; built-in presentation tools
20:14 more Quarto features -- R and python, support in other text editors, projects
24:15 Q&A
@jbloit 2 years ago
This is dope.
@larsschobitz4830 2 years ago
Thank you Mine and Rohan for sharing the talk with us. I was intrigued by Quarto and now am off trying out all the incredibly useful new functionalities that come with it. I am also now finally convinced that I should start teaching with the use of the RStudio Visual Editor.
@lacorreia65 2 years ago
Great Talk, Thank you Nathan!
@AsrifYusoff 2 years ago
Great insights. Subscribed! Let us know what you think of our grad school content.
@larsschobitz4830 3 years ago
This was fantastic. Thank you, Rohan, for organizing the workshop and making this material publicly available to us. And thank you, Mine, for another insightful talk that I have learned from a lot for my own teaching.
@ericthegreat7805 3 years ago
Could an aggregate statistic (maybe proxied by age, disability rate and covid rate, and rurality vs urbanization) be used as a proxy for mail in votes as a predictor?
@ericthegreat7805 3 years ago
What about the possibility of doing a statistical test of polling efficiency, defined by a one sided t test on the sample size being 30 x 338? 30 is usually the minimum number for normal convergence of a data set, and a "historical reliability" for each polling company, how well they attempt to achieve parity between ridings. Then weigh them by their deviation from this statistic. For example, take W = mu - n*, where n* = 30 x 338, and mu as the sample size. Then W is t distributed. Then compute the statistic W/sigma_t, where sigma_t is variance by time and between ridings, with greater variances being penalized, as a measure of poll reliability, and as a weight for each poll.
@ericthegreat7805 3 years ago
Interesting to hear your second guest agrees on the value of MRP for a regional effects analysis.
@bennyblanco2523 3 years ago
Lol, the actual number of people per class is NOT an opinion.
@ushnishsengupta9331 3 years ago
Could not attend live, but loved this talk and its deep insights into Data Projects
@xiaoranzhang4445 4 years ago
for the GitHub creation, do we need to purchase a plan? when I sign up with free plan, it just keeps verifying my account...
@yusufu9 4 years ago
Thanks for posting this, quite helpful. If you have time, could you answer these questions for me? If you record a session, do the students get to see the session, or is it only for the moderator/prof? Second, when you upload a text file, how can you enlarge it on your screen? I can only get about 35% of a page to appear, framed by an excessively large sea of black background.

Rohan Alexander
