For the whole talk I was waiting for the speaker to spring the realization on his audience that there were no trades on Wednesday, but unfortunately it never happened. There were no observed trades on Wednesday, so the sum for Wednesday is 0 and would bring down the average. The set of days has to be an independent input, as we aren't guaranteed to have trades every day. We would likely also need the full set of exchanges, as there may be an exchange that is infrequently used.
This has highlighted why a simple question turns into a complex task. I would argue that even the results shown here answer a different question than the one asked, as they do not take into account the null results (days where no apples were sold).
The question actually answered in the example was: for days on which a seller sold at least one apple, what was the daily average of apples sold for each seller?
This is where you need to be aware of the business practices happening in the real world. (Does the apple shop open only 3 days a week, or 5/6/7? Does it open on holidays, or do you just want an average over a complete time period: month/quarter/year?) The main observation is that you can't rely on the original dataset to give you your 'collapsing key'. Sorry if this was obvious.
Not at all obvious. Very good points @giles Langdon. Business rules like "Saturday and Sunday are no-sales days" may imply those days don't contribute to N. Also, a Monday-to-Friday day with zero sales volume should trigger N += 1 with Volume = 0 (i.e. it pulls down the average).
On multi-dimensional data sets it can also be important to consider things like "Bob only makes sales on the days he is at work", i.e. a user scoring 0 sales on a day may or may not contribute to N depending on whether they showed up or not...
His concepts of "Collapsing Key", "Grouping Key" and "Observation Key" are personal terms (I have not encountered them before), and he describes how they are useful.
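A minimal pandas sketch of the point about zero-volume days, using made-up numbers (a hypothetical seller "Jane" with 16 apples over 3 observed days, and an assumed Monday-to-Friday working week):

```python
import pandas as pd

# Hypothetical sales table: only days with at least one sale appear,
# so there is no Wednesday (or Friday) row at all.
sales = pd.DataFrame({
    "seller": ["Jane", "Jane", "Jane"],
    "day": ["Mon", "Tue", "Thu"],
    "apples": [6, 4, 6],
})

# Naive average: only observed days contribute to N, so N = 3.
naive = sales.groupby("seller")["apples"].mean()  # 16 / 3

# Business-rule-aware average: every working day contributes to N,
# with volume 0 on days without sales, so N = 5.
working_days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
per_day = sales.set_index("day")["apples"].reindex(working_days, fill_value=0)
adjusted = per_day.mean()  # 16 / 5
```

The key move is `reindex` over an independently supplied day list, exactly because the dataset itself cannot tell you which days existed.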
@@wkxue3826 I get ya bro. You might use something like
if isNaN(salesman.day.sales) then 0 else 1 as #salesday.total
i.e. make a hashtable (a temporary totals-by-day table). I'm liking Python these days for its dataframe concept...
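The pseudocode above, sketched as runnable Python (the data and names here are hypothetical, with NaN standing in for days with no recorded sales):

```python
import math

# The "temporary totals by day" hashtable from the comment above.
sales_by_day = {"Mon": 6.0, "Tue": 4.0, "Wed": float("nan"),
                "Thu": 6.0, "Fri": float("nan")}

# The if/then/else: 0 if the day's sales are NaN, else 1.
salesday_flags = {day: 0 if math.isnan(v) else 1
                  for day, v in sales_by_day.items()}

# Number of days the salesman actually made sales.
total_sales_days = sum(salesday_flags.values())
```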
Nice example of how analysis can go wrong if we do not get the basics right. So what basically goes wrong here?
The observational units were misidentified.
In this case the question is about an aggregate of the rows in the table, not on the individual rows themselves.
The solution is straightforward. Construct a table of aggregate data first, then answer the question using the new table.
To do that we do not need a magic formula with three subtly different keys.
Just spend some time identifying the units you are trying to process.
Use aggregation, selection, joining, etc. to get a table that has the desired observational unit in each row.
Use this table to answer the question posed.
Hint: be sure you understand the concept of observational unit (IMHO, what was missing here is a clear understanding of this concept)
en.wikipedia.org/wiki/Unit_of_observation
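A sketch of that workflow in pandas, with made-up data: first build a table whose rows are the observational unit the question is about, then answer the question against that table.

```python
import pandas as pd

# Raw rows: one row per individual sale, which is NOT the
# observational unit the question asks about.
raw = pd.DataFrame({
    "seller": ["Jane", "Jane", "Jane", "Bob"],
    "day":    ["Mon",  "Mon",  "Tue",  "Mon"],
    "apples": [10, 2, 4, 7],
})

# Step 1: aggregate to the (seller, day) observational unit.
per_day = raw.groupby(["seller", "day"], as_index=False)["apples"].sum()

# Step 2: answer the question on the new table:
# average daily apples sold per seller.
answer = per_day.groupby("seller")["apples"].mean()
```

No special key terminology is needed; each step just moves the table to the observational unit the next step requires. (Zero-sales days would still need to be joined in, as the earlier comments point out.)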
A decent talk if you aren't familiar with SQL GROUP BY. For people comfortable with the concept, it's not very valuable.
This surely could have been conveyed in 5 mins.. brevity people!!
Jane sold 16 apples in 3 days. Why is her average daily amount 16?
Besides, he presents it nicely, giving the conceptual names "collapsing key" and "grouping key" to help understand the basics. Honestly, though, I think it's a really basic concept even for an entry-level analyst...
totally Wendao - but let's not criticise
Who would have thought...
Enjoyed the talk - well executed and concise...
Very good, thanks for sharing. A lil' more Pandas and it coulda' been great :)
Thanks a lot Alex.
SQL/Pandas formula:
Inner_CollapsingKey - Outer_GroupingKey = Implicit_ObservationKey
collapsingKey == primaryKey (default)
Amazing talk.
hardly...
I'm no expert, but I feel like the meth kicked in @17:30.
Also, it seems like you're unnecessarily reinventing 'df.set_index'.
Amazing talk. These seemingly simple concepts rapidly get complicated in higher dimensions.