This was EXACTLY what I was looking for: an intuitive explanation of how BLUPs of MLMs are derived, with examples and none of the complicated jargon in textbooks. Thank you sooo much!!!!
1) Is weighting by the number of observations pretty standard? There are some recommendations for applying weights based on variance (people with more variability are given a lower weight), but it doesn't seem like there is much guidance on what should be standard practice in the field (it is not mentioned explicitly in the recent Meteyard & Davies 2020 paper). 2) I've run a 2SSS on a sample that didn't converge as a mixed model (so it's not simple to pull G or sigma2). I'd like to determine the weights for each participant, but I'm not entirely sure how to compute a single within-subject variance term for all subjects, especially since I have multiple predictors. Is there guidance on estimating a single variance term in 2SSS when mixed models are not available? I was trying to go back to basics by computing variances, betas, etc. via their formulas (e.g., (X'X)^-1*X'Y for beta), but got stuck trying to determine how to compute a single term for all participants.
Hi, as far as I've read, it is very common to only weight by number of observations (in non-fMRI mixed models). The reference I'm using here is commonly used and it follows lmer. Not sure if you're in the field of neuroimaging, but one of our analysis tools (FSL) can allow variance terms to be estimated within-group, which would make sense if you think about patients and controls (patients are probably more variable), but I have noticed that even with 100 subjects per group the variance over all subjects seems to be better than within-group. It depends on how noisy your data are and how well you can estimate a variance with very little data. On the other hand, the same software does have within-subject residual variance estimates, but there are often hundreds of data points within-subject and it seems to work well.
When your mixed model didn't converge, did you try a different optimization option? The bobyqa optimizer tends to work very well. If that's not working, then there may be something off with the scaling of the data. With 2SSS you're basically limited to assuming the within-subject variance is the same for all subjects, so the subjects will not be differentially weighted at all. When I must use 2SSS, which I tend to only do when a mixed model can't answer my question (and not due to convergence errors), I only proceed if the data are fairly balanced between subjects. Again, with convergence errors I'd keep working to try and fix it. I've just discovered the allFit function in lme4 (you must have a newer version) and it is really handy for checking the different optimizers and comparing results between them. Does that answer your question?
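On the question of a single within-subject variance for 2SSS: one generic option, assuming the within-subject variance really is the same for every subject, is to fit each subject separately, add up the residual sums of squares, and divide by the summed residual degrees of freedom. The sketch below does this in plain Python for one predictor plus an intercept. The data and the function names (`ols_simple`, `pooled_within_variance`) are made up for illustration and are not from the video or from lme4. With more predictors, `n_params` would be the number of coefficients per subject and the per-subject fit would use the usual (X'X)^-1 X'Y.

```python
def ols_simple(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1, sse)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    # Residual sum of squares for this subject
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return b0, b1, sse


def pooled_within_variance(subjects, n_params=2):
    """Pool residual variance over subjects (assumes equal variance).

    subjects: dict mapping a subject id to a pair (x, y) of lists.
    n_params: coefficients estimated per subject (intercept + slope = 2).
    """
    total_sse = 0.0
    total_df = 0
    for x, y in subjects.values():
        _, _, sse = ols_simple(x, y)
        total_sse += sse
        total_df += len(y) - n_params  # residual degrees of freedom
    return total_sse / total_df


# Hypothetical toy data: two subjects with different numbers of trials
subjects = {
    "s1": ([0, 1, 2, 3], [1.0, 3.0, 5.0, 7.0]),
    "s2": ([0, 1, 2], [0.0, 2.0, 1.0]),
}
sigma2_hat = pooled_within_variance(subjects)
print(sigma2_hat)
```

In this toy data s1 lies exactly on a line, so all of the residual sum of squares (1.5) comes from s2, and the pooled estimate is 1.5 / 3 = 0.5.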
@mumfordbrainstats Thank you for the response! Yes, development of some of our scripts was based on neuroimaging studies early on and we're trying to determine if these designs are still appropriate for our behavioral tasks.
We did try using bobyqa, and subsequent rescaling after that was not successful. We did try removing covariates from the model, but I'm starting to go through guidelines for when convergence doesn't occur to make sure we followed all the appropriate steps. We're running a two-part model (log on zero vs non-zero data and linear on non-zero data) so I believe balance is likely an issue for us. I'll definitely take a look at the allFit function. Thanks again!
@tcdizz387 FWIW, I was helping somebody here who had the same issue with tons of zeros and we ended up doing the same thing :) I don't recall if we also had convergence errors with all the data (it may not have been a mixed model, but linear regression). Very frustrating that none of the solutions worked for you. I figured you probably tried them, but had to mention it just in case.
The lme4::ranef documentation says it's extracting modes. The lme4 vignette says in the description for the `fitted` function, "Fitted values given conditional modes" and for the `coef` function, "Sum of the random and fixed effects for each level", all of which is confusing. Michael Clark's Mixed Models With R has a section that calls the `coef` output random intercepts and slopes, which are the intercept or slope + random effects, the latter being what he says is the output of `ranef`. The vignette implies that the conditional modes are either the random intercepts/slopes or the random effects, and this series describes them as the fitted values. So, in conclusion - ¯\_(ツ)_/¯.
I dropped it a few times in the text, but I referred to the estimates I was working with as the conditional mode-based subject predictions; apologies if that was confusing when I dropped "based". This would be the sum of the conditional mode estimate and the intercept (fixed effect) in this case. The common term, BLUP, doesn't make a ton of sense because how is it "best", "linear", "unbiased", as argued by Douglas Bates (2010 version of the lme4 book linked below)? Random effects are not parameters, but random variables which are described by a distribution. I like the term "conditional mode" because it is an estimate of the mode of the conditional distribution, which is the best we can do for a random effect. Since this estimate only describes the "wiggle" about the overall mean (how that subject relates to the population mean), if one wants to use it to look at a within-subject mean estimate, you'd need to add that "wiggle" to the intercept. Hope that clears things up. Conditional mode-based within-subject estimate = sum of the fixed effects intercept and the conditional mode estimate. Conditional mode estimate = estimated mode of the distribution that describes the random effect. See section 1.6 here for more details: www.researchgate.net/profile/Dimitris-Kavroudakis/post/What_is_the_appropriate_package_to_use_for_performing_NML_using_R/attachment/59d62f0dc49f478072e9f5e7/AS%3A272534858076166%401441988781179/download/lrgprt.pdf
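To make the "wiggle plus intercept" idea concrete, here is a small sketch for a random-intercept-only model. It treats the fixed intercept `beta0`, the between-subject variance `G`, and the residual variance `sigma2` as already known (the values are invented), whereas lme4 estimates them from the data. The conditional mode is then the subject's deviation from `beta0`, shrunk by a weight that depends on the number of observations. The function name `conditional_mode` is hypothetical, not an lme4 function.

```python
def conditional_mode(y_subject, beta0, G, sigma2):
    """Conditional mode of a random intercept, given known variances.

    Model assumed: y_ij = beta0 + b_i + e_ij,
    with b_i ~ N(0, G) and e_ij ~ N(0, sigma2).
    """
    n = len(y_subject)
    ybar = sum(y_subject) / n
    # Shrinkage weight: approaches 1 as the subject has more observations
    weight = G / (G + sigma2 / n)
    return weight * (ybar - beta0)


# Hypothetical values, chosen only to keep the arithmetic easy to follow
beta0, G, sigma2 = 10.0, 4.0, 4.0

one_obs = [14.0]                       # subject mean 14, n = 1
four_obs = [14.0, 14.0, 14.0, 14.0]    # same subject mean, n = 4

mode_one = conditional_mode(one_obs, beta0, G, sigma2)
mode_four = conditional_mode(four_obs, beta0, G, sigma2)

# Conditional mode-based subject estimate = fixed intercept + conditional mode
est_one = beta0 + mode_one
est_four = beta0 + mode_four
print(mode_one, est_one)
print(mode_four, est_four)
```

Both subjects have the same raw mean of 14, but the one with a single observation is pulled halfway back toward the intercept (weight 0.5, estimate 12), while the one with four observations keeps more of its deviation (weight 0.8, estimate 13.2).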
Hi, please could you show an example on how to calculate the BLUPS for random intercept and slope model? What is the formula and how do you derive it?
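For reference, the standard textbook expression for a random intercept and slope model is sketched below. This is the general linear mixed model result rather than something worked through in this thread, and the symbols (X_i and Z_i for subject i's fixed- and random-effects design matrices, G for the 2x2 covariance matrix of the random intercept and slope, sigma^2 for the residual variance) are the usual notation, assumed here. In practice G, sigma^2 and beta are replaced by their estimates.

```latex
% Model for subject i with n_i observations
y_i = X_i \beta + Z_i b_i + \varepsilon_i, \qquad
b_i \sim N(0, G), \qquad
\varepsilon_i \sim N(0, \sigma^2 I_{n_i})

% Conditional mode (BLUP) of the random intercept and slope for subject i
\hat{b}_i = G Z_i^{\top} \left( Z_i G Z_i^{\top} + \sigma^2 I_{n_i} \right)^{-1}
            \left( y_i - X_i \hat{\beta} \right)

% Random intercept only: Z_i is a column of ones and G is a scalar, giving
\hat{b}_i = \frac{G}{G + \sigma^2 / n_i} \left( \bar{y}_i - \hat{\beta}_0 \right)
```

The last line shows that the general formula collapses to weighting by the number of observations when there is only a random intercept.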