I put a lot of effort into this one to make it as descriptive as possible. It's also a new style of delivering content / animation. Please let me know how you like this. :)
Love it!
that was good man! well done!
would love to see the playlist on causal inferencing
Oh my god I love this video so much!!!!!
Thank you for the video! I have an exam on causal inference soon, and this is very helpful.
Gemini: The video is about causal inference. It explains what causal inference is and the challenges of performing causal inference using observed data. It also explains different techniques to address these challenges.
The video starts by explaining randomized controlled trials (RCTs), which are the gold standard for causal inference. But RCTs are not always possible, so the video moves on to causal inference using observed data.
Causal inference using observed data is challenging because there can be confounding variables that affect both the treatment and the outcome. The video uses an example of a medical trial for the flu cure to illustrate this point. In the example, age is a confounding variable. The treatment group (people who received the elixir) has an average age of 35 while the control group (people who did not receive the elixir) has an average age of 65. Even if the people in the treatment group recover from the flu faster than the people in the control group, it might be because they are younger, not because of the elixir.
Another challenge of causal inference using observed data is selection bias. Selection bias happens when the group chosen for the treatment is not representative of the population. For example, if the people who received the elixir in the medical trial were all young and healthy people, then the results of the trial would not be generalizable to the whole population.
The video also talks about counterfactuals, which are what would have happened if a person had not received the treatment. Counterfactuals are necessary to estimate the causal effect of the treatment. There are two techniques for estimating counterfactuals: matching and machine learning.
Matching involves finding people in the control group who are similar to the people in the treatment group on all observable characteristics except for the treatment. The outcome of the people in the control group can then be used as an estimate of the counterfactual for the people in the treatment group.
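A minimal sketch of the matching idea described above, assuming age is the only characteristic being matched on. The records, names, and the nearest-neighbour rule are hypothetical and are not the data or the exact procedure from the video.

```python
# Hypothetical records: (name, age, treated, recovered). Not the video's data.
people = [
    ("A", 30, 1, 1),
    ("B", 34, 1, 1),
    ("C", 62, 1, 0),
    ("D", 31, 0, 1),
    ("E", 36, 0, 0),
    ("F", 60, 0, 0),
    ("G", 70, 0, 1),
]

treated = [p for p in people if p[2] == 1]
control = [p for p in people if p[2] == 0]


def nearest_control(person, pool):
    """Return the control unit whose age is closest to this person's age."""
    return min(pool, key=lambda c: abs(c[1] - person[1]))


# Matching with replacement: one control may serve several treated units.
effects = []
for t in treated:
    match = nearest_control(t, control)
    # Observed outcome minus the matched control's outcome,
    # which stands in for the unobserved counterfactual.
    effects.append(t[3] - match[3])

# Averaging over treated units estimates the effect on the treated (ATT).
att = sum(effects) / len(effects)
print(f"Estimated effect on the treated: {att:.2f}")
```

Real analyses usually match on several covariates at once (or on a propensity score) and discard treated units that have no sufficiently close control.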
Machine learning can also be used to estimate counterfactuals. A machine learning model can be trained on data from people who did not receive the treatment. The model can then be used to predict what would have happened to the people in the treatment group if they had not received the treatment.
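As a rough illustration of the model-based approach above, the sketch below fits a one-variable least-squares line on untreated people only and uses it to predict the treated group's outcomes without treatment. The data, the recovery-time outcome, and the choice of a straight line as the "model" are all assumptions made for illustration.

```python
# Hypothetical untreated units: (age, days_to_recover). Not the video's data.
control = [(25, 4.0), (35, 5.0), (45, 6.0), (55, 7.0), (65, 8.0)]
# Hypothetical treated units with their observed recovery times.
treated = [(30, 3.5), (40, 4.5), (60, 6.0)]


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Train only on people who did NOT receive the treatment.
intercept, slope = fit_line(control)

effects = []
for age, observed in treated:
    # Predicted outcome had this treated person gone untreated.
    counterfactual = intercept + slope * age
    effects.append(observed - counterfactual)

att = sum(effects) / len(effects)
print(f"Estimated change in recovery time for the treated: {att:.2f} days")
```

The prediction is only trustworthy for ages that the untreated data actually covers; extrapolating outside that range is where this approach tends to break down.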
The video then talks about the assumptions that need to be made for causal inference using observed data. These assumptions are necessary to make the analysis possible. One of the assumptions is called the causal Markov condition. This assumption says that, once a variable's direct causes in the causal graph are accounted for, the variable is independent of everything that is not one of its effects.
Another assumption is called SUTVA (Stable Unit-Treatment Value Assumption). This assumption says that the outcome of a unit would be the same no matter what treatment the other units receive.
The last assumption is called ignorability. This assumption says that there are no confounding variables that have not been included in the analysis.
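The video then shows how to calculate the average treatment effect (ATE) and the conditional average treatment effect (CATE). The ATE is the average, over everyone in the study, of the difference between a person's outcome with the treatment and their outcome without it (one of the two is always an estimated counterfactual). A raw difference in means between the treatment group and the control group only equals the ATE when there is no confounding. The CATE is the average treatment effect for a specific subgroup of the population.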
In the example of the medical trial, the ATE was 0.1. This means that, averaged over everyone, receiving the elixir raised the estimated chance of recovering from the flu by 0.1. However, the CATE for people over the age of 35 was 0.4, while the CATE for people under the age of 35 was -0.2. This suggests that the elixir helped older people but not younger people, for whom the estimated effect was negative.
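A small sketch of how ATE and CATE can be computed once every person has both an observed outcome and an estimated counterfactual. The table below is hypothetical, so its numbers will not reproduce the 0.1, 0.4, and -0.2 quoted above.

```python
# Hypothetical table: y1 = outcome with the elixir, y0 = outcome without it
# (1 = recovered, 0 = did not recover). For each person one of the two is
# observed and the other is an estimated counterfactual.
rows = [
    {"age": 25, "y1": 1, "y0": 1},
    {"age": 30, "y1": 0, "y0": 1},
    {"age": 33, "y1": 1, "y0": 1},
    {"age": 50, "y1": 1, "y0": 0},
    {"age": 62, "y1": 1, "y0": 0},
    {"age": 70, "y1": 0, "y0": 0},
]


def average_effect(subset):
    """Mean of the individual effects y1 - y0 over the given rows."""
    return sum(r["y1"] - r["y0"] for r in subset) / len(subset)


# ATE: average individual effect over everyone.
ate = average_effect(rows)

# CATE: the same average restricted to a subgroup (here, split at age 35).
cate_older = average_effect([r for r in rows if r["age"] > 35])
cate_younger = average_effect([r for r in rows if r["age"] <= 35])

print(f"ATE: {ate:.2f}")
print(f"CATE (over 35): {cate_older:.2f}")
print(f"CATE (35 and under): {cate_younger:.2f}")
```

A positive overall ATE can hide subgroups with opposite signs, which is why the CATEs are worth computing separately.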
The video concludes by saying that causal inference using observed data can be a powerful tool for making decisions, but it is important to be aware of the challenges and assumptions involved.
I’m taking a masters in data analytics/program evaluation, and am learning this rn. You summarize the information really well, picking out the really important parts of causal inference to explain. Good job! The later part of the video even helped me conceptualize quasi experimental designs, which use matching like you described. Thanks for the help.
same hahahha, walking your path rn
Absolutely beautiful, incredible explanation; I like that it's explained through a practical example!
You're very underrated; the future of this channel is bright!
Brother, you have summarized really well in such a short video. Every second was GOLD 🙂
Best video I've seen on this topic and I've seen MANY.
Absolutely beautiful and incredible explanation. I also like the fact that it's explained through a practical example :)
Cheers!!
Very useful video. I spent two days reading the actual paper of causal influence. This video is concise but gives me a very good foundation to read the theory.
That’s the hope! Thanks a ton for watching
Amazing explanation! It must've been almost painful to not discuss all the details and caveats and technicalities, but that's what made it valuable for me
Love the music as well :D
Incredibly well explained and very illustrative examples. Many thanks for the work you put on it.
This content is gold. Thank you so much for making these kinds of videos!! Can’t wait to see more!!
My prof needed 3h to explain this (and failed - thats why im here). Thanks for the video, helped a lot!
reading a paper utilizing causal inference rn - this cleared so much up, props!
Really good video - appreciate the effort that it must have taken to convey the concepts intuitively whilst being as succinct as possible. Not easy!
Great video. I read a lot of material and couldn't digest it. This one is the best I've seen. Thanks
Glad it was helpful! And thanks for watching!
I think this is a really good overview of Causal Inference and the main assumptions! Good DAG explanation as well!
Thanks so much for watching! And for the comment
“Heterogeneity” has *seven* (7) syllables. Try again. That “e-i-t-y” is three (3) of those 7 syllabuses, for a hint.
Amazing explanation! Got to learn a lot and understood everything. Thanks a lot!
I think the job you're doing in these videos is better than some research papers, by simplifying these topics for the public
Thank you! Much appreciated
Thank you so much, your explanation is way more clear than my prof in this class
Super glad you feel this way! Thanks so much for watching!
How do you create the Treatment and Control groups at 8:55?
Thank you for this exceptionally well-presented video - rich in content and succinct.
This is a really really really well done video, thank you!
All of the calculations are simple and clear, but a key element is missing, which you mention at 11:31, namely how to estimate missing data. Could you send a link to an explanation of this element of the presentation?
Thanks a lot for this video! Keep up the good work, and please try to cover Causal Graphs (Directed Acyclic Graphs) vs Bayesian Network structure learning (also in detail) if you can. Thanks in advance.
Your presentation is missing a key element, which you mention at 11:31, namely how to estimate missing data. Could you send a link to an explanation of this element of the presentation?
Great video! what tool do you use to create the presentation and the animations?
hello, what is the problem with the following approach which aims to account for age without counterfactuals?
you can do mean(treatment) - mean(control) for the older group ((0+1+1)/3 = .67) - ((1+0)/2 = .5) resulting in a difference of .17 for the older group and a similar calculation for the younger group yields ((1+0)/2 = .5) - ((1+0+0)/3 = .33) resulting in a difference of .17 for the younger group as well.
using this approach, there does not seem to be a difference due to age!
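The arithmetic in the comment above can be reproduced directly. The outcome lists are copied from the comment (1 = recovered); the variable names are made up for this sketch.

```python
# Recovery outcomes (1 = recovered) exactly as listed in the comment above.
older_treated, older_control = [0, 1, 1], [1, 0]
younger_treated, younger_control = [1, 0], [1, 0, 0]


def mean(values):
    return sum(values) / len(values)


# Difference in recovery rate within each age group (stratum).
diff_older = mean(older_treated) - mean(older_control)
diff_younger = mean(younger_treated) - mean(younger_control)

print(f"Older group difference:   {diff_older:.2f}")
print(f"Younger group difference: {diff_younger:.2f}")
```

Comparing treated and control within each age group is a form of stratification, which is itself a standard way to adjust for a confounder. With only two or three people per cell, though, each difference is very noisy, so equal values here say little about whether the effect really is the same across ages.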
Great video. small correction, pronunciation is causal and not caushal
Great video. Thank you for sharing!
Clear and good explanation.
An idea: could you consider a video comparing this to Bayesian networks?
great explanation, i've been studying c.i. for the past 6 months and your way of explaining was very clear.
Cheers from Bolivia.
P.S. can you share your discord link again plz
Thank you! Appreciated! The Discord link should be in the description of this video :)
Hey Ajay, thanks a lot for making this video. Super helpful. Best video I came across on Causal inference.
I have a question regarding the balance check between the treatment and control groups.
Is it necessary to satisfy the balance criteria if I am using an ML model to predict the counterfactuals? Is it okay if some confounders are not balanced between the treatment and control groups? Would really appreciate any help with this.
Great summary thank you🙏🏾
Could you please upload a separate dedicated tutorial teaching us how to say "efficacy" the way you do!
Haha. I shall put that in my todos :)
Does including the confounder variable in a multiple regression also “control” for false association?
Yeah, I have the same question. But the matching approach seems very common in biostatistics. I am still figuring out the difference.
at 10:06 you mention that the age difference was large enough to warrant age being labeled as a confounding variable. what exactly was the magnitude of difference that leads to that assumption? if the age means were 35 and 40, would that be a large enough difference? thanks.
You have 2 distributions of ages. You can conduct a statistical test to check if the 2 distributions are different. If the difference is significant, then yes.
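One way to sketch the check suggested in the reply above is a Welch t statistic on the two age samples, alongside a standardized mean difference, which does not depend on sample size. The ages are hypothetical and are not taken from the video.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical ages for the two groups; not the video's data.
treated_ages = [30, 32, 35, 38, 40]
control_ages = [60, 62, 65, 68, 70]


def welch_t(a, b):
    """Welch's t statistic for a difference in means (unequal variances)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def standardized_mean_difference(a, b):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(a) + variance(b)) / 2)
    return (mean(a) - mean(b)) / pooled_sd


t_stat = welch_t(treated_ages, control_ages)
smd = standardized_mean_difference(treated_ages, control_ages)

print(f"Welch t statistic: {t_stat:.2f}")
print(f"Standardized mean difference: {smd:.2f}")
```

A common rule of thumb treats an absolute standardized mean difference above roughly 0.1 as a sign of imbalance, so whether means of 35 and 40 matter depends on how spread out the ages are. Imbalance alone does not make a variable a confounder; it also has to affect the outcome.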
Would age in this case be an effect modifier?
Great video. 👏
Thank you!
Awesome job explaining!!!!
wow!!!! your explanation is better than my epidemiology professor's. thanks a lot!!! By the way, is there any paper you would recommend on RCT design or on causal inference?
Thank you! As for specific resources, I put them in the description of the video. I don't think there is a single research paper that is one-size-fits-all for the topic, but a collection of these resources does paint a good picture. Also, the next video's description has other resources from a machine learning perspective
@CodeEmporium Thank u so much! I learn a lot from your channel
Well explained.
Can causal inference be applied to time series data?
Yep. It can and it often is
Sir, can you make a video on using DeepSpeed on Pytorch Cifar10? How to implement it on it.
I can look into this and see what the most palatable format is for a video. Thanks for the suggestion
Thank you so much.
The counterfactuals seem questionable... Is it really reasonable to say Sam would not get better with the treatment if he did get better without the treatment? That seems highly unlikely, doesn't it?...and the inverse for Rondo seems highly unlikely as well...
I'm admittedly clueless about statistics but I'm always on the lookout for bad logic and this was a red flag for me.
I don't mean to suggest a bad example on your part but rather that, in general, it seems there is a huge opening for error to sneak in through counterfactuals.
Your voice is soo ..beautiful 😍
Thanks
Thanks!
Your face being a distraction is some sort of error, you should release one with it to find the counterfactual
"Control other effect through randomisation"
Nice
gold
caushl? caushl?
Caw-zal
David Cross, is that you?
50% recovered, RIP the other 50% elderly 😭
This is missing statistical testing. All these are potentially non significant marginal results.
does this guy not know how to pronounce "causal"?
caw-zhul
I am a causality denier! I don't believe in causality. At least not the causality that we are familiar with. I think we need higher-order logic of at least the 69th degree to come up with an explanation for causality. I don't wear a tinfoil hat. I wear a quantum metamaterial protective helmet.
Aliens are real