Fitzmaurice, Laird & Ware (2011) - Applied Longitudinal Data Analysis
Naim Rashid
Lecture Objectives
By the end of this lecture, you will be able to:
Distinguish between marginal (GEE) and subject-specific (mixed-effects) approaches
Formulate random-intercept and random-slope models for longitudinal data
Understand how random effects induce covariance structures (CS, nonstationary)
Apply shrinkage/EBLUP concepts for subject-specific predictions
Choose between ML and REML for appropriate model comparisons
Note
Reading. FLW Ch. 8 (8.1-8.10). Assessed by: HW3 (LME with CS/AR(1)/UN covariance; marginal vs conditional interpretation) and the longitudinal-modeling component of the final project.
Roadmap
Section
Topics
Foundations
Why mixed-effects? RI vs RS models
Estimation
ML/REML, BLUPs, shrinkage
TLC Application
Data wrangling, model fitting, diagnostics
Model Selection
Comparing structures, boundary tests
Advanced Topics
Heterogeneous variances, AR(1), splines
Summary
Reporting checklist, practical guidance
Note
Pacing (2 class periods). Day 1 (Part I): foundations through BLUPs and shrinkage, plus the core RI/RS fits on the TLC data. Day 2 (Part II): heterogeneous variances, AR(1), diagnostics, testing, and practical guidance.
Prerequisites / recall
Before this lecture you should be able to:
Describe the CS, AR(1), and UN covariance structures and what each assumes (Ch. 7).
Interpret a marginal (population-average) regression coefficient.
Fit a basic linear model in R and read its summary() output.
Notation reference
Symbol
Meaning
\(Y_{ij}\)
response for subject \(i\) at occasion \(j\) (scalar)
\(\mathbf{Y}_i\)
response vector for subject \(i\)\((n_i \times 1)\)
\(\mathbf{X}_i,\ \boldsymbol\beta\)
fixed-effect design \(n_i \times p\) and coefficients
\(\mathbf{Z}_i,\ \mathbf{b}_i\)
random-effect design and random effects, \(\mathbf{b}_i \sim N(0, G)\)
Averaging the subject-specific mean over \(b_i \sim N(0, G)\) gives the marginal mean \(X_i\boldsymbol\beta\), so the LME also has a marginal interpretation. GEE is one way to target that marginal mean directly (with robust SEs); the LME targets it while also predicting subject trajectories and leveraging all available data under MAR.
Model Building Blocks
Let \(Y_{ij}\) be the response for subject \(i\) at time \(t_{ij}\).
This is a quadratic in \(t_{ij}\): trajectories fan out (or in) over time, unlike the constant variance of compound symmetry. This is the nonstationary, “linear-in-time” covariance shown in the RI+RS row of the Link to Chapter 7 table below (it is introduced here, not in Ch. 7).
RI vs RS: When and Why
Model
Use When
RI
Subjects have different baselines but common trajectory shape
RS
Subjects have different rates of change; trajectories fan in/out
If slopes vary across persons (common in biology), RS is often essential.
Two-Stage Random Effects Formulation (FLW 8.4)
FLW motivate the LME model in two stages, which makes clear where the random effects come from:
Stage 1 (within-subject):\(\mathbf{Y}_i = \mathbf{Z}_i \boldsymbol\beta_i + \boldsymbol\varepsilon_i\), with \(\boldsymbol\varepsilon_i \sim N(0, R_i)\). Each subject follows their own regression with coefficients \(\boldsymbol\beta_i\).
Stage 2 (between-subject):\(\boldsymbol\beta_i = \mathbf{A}_i \boldsymbol\beta + \mathbf{b}_i\), with \(\mathbf{b}_i \sim N(0, G)\). Here \(\mathbf{A}_i\) is a known between-subject design matrix linking each subject’s coefficients to the population means \(\boldsymbol\beta\) (often an identity, or a map of subject-level covariates). The subject-specific coefficients vary around \(\boldsymbol\beta\).
Substituting Stage 2 into Stage 1 gives the familiar single-equation form
The random-intercept and random-slope models are special cases: the two-stage view simply makes explicit that \(G\) is the between-subject covariance of the \(\mathbf{b}_i\).
The Link to Chapter 7
Random Effects
Implied Covariance
RI only
Compound Symmetry (CS)
RI + RS
Linear-in-time covariance (nonstationary)
Mixed-effects give parsimonious, interpretable\(\Sigma_i\) via random effects rather than directly parameterizing \(\Sigma_i\).
Check Your Understanding: Random Effects Basics
In the random-intercept model, what is the implied correlation between two observations \(Y_{ij}\) and \(Y_{ik}\) from the same subject?
When would you choose a random-slope model over a random-intercept model?
A random-intercept model is equivalent to which covariance pattern from Chapter 7?
Answers:
\(\text{Corr}(Y_{ij}, Y_{ik}) = \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2}\). This is the intraclass correlation (ICC), representing the proportion of total variance attributable to between-subject differences.
When you expect subjects to have different rates of change over time, not just different starting points. Visual signs include trajectories that “fan out” or cross over time in spaghetti plots.
Compound Symmetry (CS). The random intercept induces constant correlation between any two time points for the same subject.
Estimation Overview
Method
Use For
ML
Comparing different fixed effects
REML
Comparing different random-effects structures (same fixed effects)
Mixed-model equations (Henderson) yield BLUPs of random effects; with unknown variance components they become EBLUPs (empirical BLUPs, the FLW term for predicted random effects: BLUPs computed using the estimated variance components rather than known ones).
Shrinkage Intuition (EBLUP)
Borrowing strength: subject-specific intercepts/slopes are pulled toward population means.
Shrinkage is stronger when:
Subject has few measurements
Measurement error \(\sigma^2\) is large
Random-effect variance is small
This prevents overfitting noisy within-subject patterns.
ggplot(tlc, aes(time_num, y, group = id, color = group)) +geom_line(alpha =0.25) +stat_summary(aes(group = group), fun = mean, geom ="line", linewidth =1.1) +labs(title ="Lead levels over time (TLC)", x ="Week", y ="Lead") +theme_minimal() +guides(color =guide_legend(title ="Group"))
TLC Visuals: Spaghetti Plot
Random-Intercept Model (RI)
lme_ri <-lme(y ~ group * time_f, random =~1| id, data = tlc, method ="REML")summary(lme_ri)$tTable[1:8, , drop =FALSE]
\(\sigma_b\) (StdDev) and \(\sigma_b^2\) (Variance)
Between-subject SD: How much subjects differ at baseline
Residual
\(\sigma\) (StdDev) and \(\sigma^2\) (Variance)
Within-subject SD: Variability around each subject’s trajectory
Derived quantities:
ICC (intraclass correlation) = \(\frac{\sigma_b^2}{\sigma_b^2 + \sigma^2}\) = proportion of variance due to between-subject differences
Total variance = \(\sigma_b^2 + \sigma^2\)
Example interpretation: If VarCorr shows (Intercept) StdDev = 8.5 and Residual StdDev = 5.2, then ICC = \(8.5^2/(8.5^2 + 5.2^2) = 72.25/99.29 \approx 0.73\). About 73% of the total variance is between subjects.
RI Model Specification
Fixed: group x time (factor) = maximal mean for covariance selection
Random: intercept only (\(b_{0i}\))
Random-Slope Model (RS)
A random slope needs numeric time (time_num) so each subject has a single slope coefficient. (We keep the factor time_f for the population mean in the next model; the consequence of mixing factor and numeric time for model comparison shows up on the AIC slide.)
ctrl <-lmeControl(niterEM =200,msMaxIter =200,msMaxEval =400,tolerance =1e-6,returnObject=TRUE)lme_rs <-lme(y ~ group * time_num, random =~ time_num | id, data = tlc, method ="REML", control = ctrl)summary(lme_rs)$tTable
Random coefficients allow individual-specific baseline and slope.
Note
ctrl (defined above via lmeControl) just relaxes nlme’s convergence settings and is reused via control = ctrl in the later fits. If a fit warns about convergence this usually helps. Note that returnObject = TRUE returns the fit even if it did not fully converge, so always confirm the fit converged before trusting it.
Not strictly comparable if fixed effects differ (time factor vs numeric): REML AIC compares random structures only when the fixed effects match (see Estimation Overview).
RI vs RS: AIC/REML Comparison
model AIC logLik
1 RI (time_f) 2480.621 -1230.311
2 RS (time_num) 2681.281 -1332.641
Factor-Time Fixed Effects with RS Random
Strategy: keep time as factor in fixed effects (rich mean), use time_num in random part to permit RS.
lme_rifs <-lme(y ~ group * time_f, random =~ time_num | id, data = tlc, method ="REML", control = ctrl)data.frame(model =c("RI", "RI+RS"),AIC =c(AIC(lme_ri), AIC(lme_rifs)),logLik=c(logLik(lme_ri), logLik(lme_rifs)))
Factor-Time Fixed Effects with RS Random
model AIC logLik
1 RI 2480.621 -1230.311
2 RI+RS 2471.239 -1223.620
Interpreting Fixed Effects with RS
Component
Interpretation
Fixed effects
Summarize average trajectory
Random effects
Capture individual departures
Interaction group x time
Difference-in-change averaged across subjects
EMMs for Contrasts at Planned Times
emm_ri <-emmeans(lme_ri, ~ group | time_f)pairs(emm_ri)
Same use of emmeans as in Ch. 7; SEs now reflect random-effects structure.
So what? A wide spread of these intercept BLUPs \(b_{0i}\) means substantial between-subject variation in baseline, consistent with a large \(\sigma_{b0}\) in VarCorr: the histogram is the picture of that variance component. (The QQ-plot slide recomputes re from the same fit.)
re_df <-as.data.frame(re)ggplot(re_df, aes(x =`(Intercept)`)) +geom_histogram(bins =20) +labs(title ="BLUPs of random intercepts", x =expression(b[0*i]), y ="Count") +theme_minimal()
Part II (Day 2): Extensions, Diagnostics, and Practice
Heterogeneous Residual Variances + RI/RS
Mixed models can combine random effects + a within-subject residual covariance \(R_i\):
Heterogeneous residual SDs via weights = varIdent(~ 1 | time_f) (a separate residual SD per visit, instead of one constant \(\sigma\))
Additional within-subject correlation via corAR1(~ time_num | id) (residual AR(1): correlation that decays with the time gap) if needed
lme_rifs_het <-lme(y ~ group * time_f, random =~ time_num | id,weights =varIdent(~1| time_f),data = tlc, method ="REML", control = ctrl)AIC(lme_rifs, lme_rifs_het)
lme_rifs_ar1 <-lme(y ~ group * time_f, random =~ time_num | id,correlation =corAR1(form =~ time_num | id),data = tlc, method ="REML", control = ctrl)data.frame(model =c("RI+RS", "RI+RS + AR(1)"),AIC =c(AIC(lme_rifs), AIC(lme_rifs_ar1)))
Only add AR(1) if random effects do not remove residual serial correlation.
Adding Residual AR(1) to RI/RS
model AIC
1 RI+RS 2471.239
2 RI+RS + AR(1) 2473.233
Diagnostics: Residuals vs Fitted
res <-resid(lme_rifs, type ="normalized")fit <-fitted(lme_rifs)ggplot(data.frame(fit, res), aes(fit, res)) +geom_point(alpha =0.5) +geom_hline(yintercept =0, linewidth =0.3) +labs(title ="Residuals vs fitted (normalized)", x ="Fitted", y ="Residual") +theme_minimal()
Normalized residuals are Cholesky-standardized: they strip out the modeled within-subject correlation, which is why we check these (not raw residuals) for a correlated-data model.
Diagnostics: Residuals vs Fitted
Diagnostics: ACF of Residuals
acf_out <-ACF(lme_rifs, resType ="normalized")ggplot(acf_out, aes(lag, ACF)) +geom_col(width =0.1) +geom_hline(yintercept =0, linewidth =0.3) +labs(title ="ACF of normalized residuals", x ="Lag", y ="ACF",subtitle ="Reading rule: bars within the band (flat) = good; slow decay = leftover serial correlation") +theme_minimal()
Diagnostics: ACF of Residuals
Diagnostics: Semivariogram
vg <-Variogram(lme_rifs, form =~ time_num | id, resType ="normalized")ggplot(vg, aes(dist, variog)) +geom_point() +geom_smooth(se =FALSE, span =0.9) +geom_hline(yintercept =1, linetype =3) +labs(title ="Semivariogram of normalized residuals", x ="Lag (weeks apart)", y =expression(hat(gamma)),subtitle ="Reading rule: flat near 1 = correlation captured; rising from a low value = consider AR(1)") +theme_minimal()
Diagnostics: Semivariogram
Diagnostics: Random-Effects QQ Plot
re <-ranef(lme_rifs)qq <-as.data.frame(re)ggplot(qq, aes(sample =`(Intercept)`)) +stat_qq() +stat_qq_line() +labs(title ="QQ plot: predicted random intercepts (BLUPs)", x ="Theoretical", y ="Sample") +theme_minimal()
Do NOT use this to test random-effects normality
The predicted random effects (empirical BLUPs) are shrunk toward 0, so their distribution understates the true random-effects variance and is not a valid check of the \(N(0, G)\) assumption (FLW p. 273). Use the BLUP QQ plot only to spot individual subjects with unusual profiles (outliers/influence), never as a normality test.
Diagnostics: Random-Effects QQ Plot
Testing Fixed Effects
Test Type
Method
Wald tests (z/F)
Large-sample z (FLW); KR/Satterthwaite df is a modern add-on
Likelihood-ratio tests
Between nested fixed-effects models (requires ML fits)
FLW Ch. 8 uses large-sample inference: Wald z-statistics and LRTs without a small-sample df adjustment (Tables 8.5 and 8.12 report z-statistics).
Warning
Beyond FLW (standard modern add-on). Kenward-Roger and Satterthwaite df correct Wald tests in small samples but are not part of FLW Ch. 8. nlme::lme (used here) provides neither: it reports containment df. For KR/Satterthwaite, fit with lme4::lmer and use lmerTest (Satterthwaite) or pbkrtest (Kenward-Roger). The LRT route below works directly in nlme once both models are refit with ML.
lme_rifs_ML <-update(lme_rifs, method ="ML")lme_rifs_noInt_ML <-lme(y ~ group + time_f, random =~ time_num | id, data = tlc, method ="ML", control = ctrl)anova(lme_rifs_noInt_ML, lme_rifs_ML)
Testing Fixed Effects
Model df AIC BIC logLik Test L.Ratio p-value
lme_rifs_noInt_ML 1 9 2578.232 2614.155 -1280.116
lme_rifs_ML 2 12 2481.167 2529.064 -1228.583 1 vs 2 103.0653 <.0001
Testing Variance Components (Boundary)
Variance-component tests are boundary tests (a variance cannot be negative), so the null distribution is a 50:50 mixture, not a plain \(\chi^2\). Which mixture depends on how many parameters you drop:
One variance (e.g. \(\sigma_{b0}^2 = 0\): is a random intercept needed at all?): \(\tfrac12\chi^2_0 + \tfrac12\chi^2_1\). Only here is the corrected p the naive \(\chi^2_1\) p-value halved.
A random slope, dropped from a model that already has a random intercept (our example below: removes \(\sigma_{b1}^2\)and the intercept-slope covariance \(\sigma_{b01}\), two parameters): \(\tfrac12\chi^2_1 + \tfrac12\chi^2_2\) (Stram & Lee). Not a simple halving.
lrt <-anova(lme_ri, lme_rifs) # REML; same fixed effects, drop the random slopelrt
The naive anova()\(\chi^2_2\) p-value is conservative (too large). The mixture above is the correct adjustment for this two-parameter test.
Note
FLW’s procedure (Section 8.5, p. 209). Compare the LRT statistic against the 50:50 chi-squared mixture critical values in FLW Appendix Table C.1. For comparing two correlated random effects (random intercept + slope) versus one (random intercept only) – exactly this random-slope test, the \(\tfrac12\chi^2_1 + \tfrac12\chi^2_2\) mixture – the \(\alpha = 0.05\) critical value is the \(q=1\) row of Table C.1, 5.14 (smaller than the naive \(\chi^2_2\) value of 5.99). FLW recommend \(\alpha = 0.10\) instead of \(0.05\) only for more complex comparisons (\(q\) vs \(q+k\) correlated random effects with \(k > 1\)), where the null distribution is less well understood. (Testing a single variance component at 0 is the \(\tfrac12\chi^2_0 + \tfrac12\chi^2_1\) case, \(\alpha = 0.05\) critical value about 2.71.)
Tip
Beyond FLW (modern extension). For exact inference on a single variance component, a parametric bootstrap of the restricted LRT is available via RLRsim::exactRLRT; pbkrtest offers a parametric bootstrap LRT. These are not the chapter’s method (FLW uses the tabulated mixture above), but they avoid the asymptotic mixture approximation.
Testing Variance Components (Boundary)
Model df AIC BIC logLik Test L.Ratio p-value
lme_ri 1 10 2480.621 2520.334 -1230.311
lme_rifs 2 12 2471.239 2518.894 -1223.620 1 vs 2 13.38185 0.0012
# Dropping the random slope removes TWO parameters: use the 1/2 chi^2_1 + 1/2 chi^2_2 mixtureLR <- lrt[["L.Ratio"]][2]p_mix <-0.5*pchisq(LR, 1, lower.tail =FALSE) +0.5*pchisq(LR, 2, lower.tail =FALSE)c(naive_chisq2 = lrt[["p-value"]][2], mixture_corrected = p_mix)
Check Your Understanding: Model Selection and Testing
You want to test whether the group * time_f interaction is significant. Should you use ML or REML for this comparison?
When testing whether to drop the random slope (i.e., testing \(\sigma_{b1}^2 = 0\)), why is the standard \(\chi^2\) test inappropriate?
A colleague compares two models: one with random = ~ 1 | id and one with random = ~ time | id, but the fixed effects differ (one uses time_f, the other uses time_num). Can they use REML AIC to compare these models?
Answers:
ML. Use ML when comparing models with different fixed effects. REML likelihoods are not comparable across models with different fixed-effects structures.
Testing \(\sigma_{b1}^2 = 0\) is a boundary test because the null places the parameter on the edge of its space (variance cannot be negative), so the null distribution is a 50:50 mixture, not a plain \(\chi^2\). Dropping the random slope from a model that already has a random intercept removes two parameters (\(\sigma_{b1}^2\) and the intercept-slope covariance \(\sigma_{b01}\)), so the correct mixture is \(\frac{1}{2}\chi^2_1 + \frac{1}{2}\chi^2_2\), not the standard \(\chi^2_2\). (Testing a single variance, e.g. whether to include a random intercept at all, gives \(\frac{1}{2}\chi^2_0 + \frac{1}{2}\chi^2_1\), where halving the p-value is exact.) The standard \(\chi^2\) test is conservative.
No. REML AIC is only valid for comparing models with identical fixed effects. Since the fixed effects differ, they cannot use REML AIC. They would need to either (a) make the fixed effects the same, or (b) use ML for both models if comparing fixed effects is the goal.
Beyond Linear Time: Splines
library(splines)lme_spl <-lme(y ~ group *ns(time_num, df =3),random =~ time_num | id, data = tlc, method ="REML", control = ctrl)AIC(lme_rifs, lme_spl)
Flexible mean over time; keep random slopes (or random spline coefficients if enough data).
Beyond Linear Time: Splines
df AIC
lme_rifs 12 2471.239
lme_spl 12 2465.472
Time-Varying Covariates (TVCs)
Mixed models handle TVCs naturally.
Watch out for endogeneity (covariate depends on past \(Y_{ij}\)).
Interactions with time (fixed and/or random slopes) often required.
Missing Data (MAR) and LMMs
Under MAR, ML/REML estimates remain valid with an appropriate mean + random-effects structure.
LMMs use all available data without ad hoc imputation.
Simulation: Shrinkage and EBLUPs
set.seed(8)N <-200; m <-3ids <-factor(seq_len(N))times <-0:2G <-matrix(c(1.0, 0.3, 0.3, 0.4), 2, 2) # random-effects covariance (FLW's G)sigma2 <-1.2^2# residual variancesim <-do.call(rbind, lapply(ids, function(i){ b <- MASS::mvrnorm(1, mu =c(0,0), Sigma = G) tibble::tibble(id = i, time = times,y =5+0.6*time + b[1] + b[2]*time +rnorm(length(times), sd =sqrt(sigma2)))}))fit <-lme(y ~ time, random =~ time | id, data = sim, method ="REML", control = ctrl)blup <-ranef(fit)head(blup)
Note BLUPs near 0 when subject has little information (shrinkage).
set.seed(81)ids_few <-sample(levels(sim$id), 6)pred <-expand.grid(id = ids_few, time =seq(0,2,by=.05))pred$pop <-predict(fit, newdata = pred, level =0)pred$ss <-predict(fit, newdata = pred, level =1)ggplot(sim %>% dplyr::filter(id %in% ids_few),aes(time, y, group = id)) +geom_point() +geom_line(aes(y = pop, group = id), data = pred, linetype =2) +geom_line(aes(y = ss, group = id), data = pred) +facet_wrap(~ id, ncol =3) +labs(title ="Shrinkage stronger with sparse/noisy subjects", x ="Time", y ="Y") +theme_minimal()
Visual: Shrinkage with Few vs Many Points
Check Your Understanding: Shrinkage and BLUPs
What are the three conditions that make shrinkage stronger (i.e., pull subject-specific estimates more toward the population mean)?
What is the difference between a BLUP and an EBLUP?
In predict(fit, level = 0) vs predict(fit, level = 1), which one incorporates the subject-specific random effects?
Answers:
Shrinkage is stronger when: (a) the subject has few measurements, (b) the residual error \(\sigma^2\) is large, and (c) the random-effect variance is small. In each case, subject-specific data is less informative, so we borrow more from the population.
BLUP (Best Linear Unbiased Predictor) assumes variance components are known. EBLUP (Empirical BLUP) uses estimated variance components from the data. In practice, we always compute EBLUPs since true variances are unknown.
level = 1 gives subject-specific predictions that include the BLUPs. level = 0 gives population-average predictions using only the fixed effects. The solid lines (level=1) in the shrinkage plot follow individual patterns; dashed lines (level=0) show the overall population trend.
Prediction and Intervals
Level
What It Uses
Population-level (level = 0)
Fixed effects only
Subject-specific (level = 1)
Includes BLUPs
Prediction intervals require accounting for both residual and random-effect uncertainty, so there is no one-line predict() interval for a subject-specific prediction. The standard route (stated as reference, not run here):
Note
Parametric-bootstrap prediction interval (recipe). (1) Simulate new random effects \(\mathbf{b}^\ast \sim N(0,\hat G)\) and residuals \(\varepsilon^\ast \sim N(0,\hat\sigma^2)\) from the fitted model; (2) form \(y^\ast = \mathbf{X}\hat{\boldsymbol\beta} + \mathbf{Z}\mathbf{b}^\ast + \varepsilon^\ast\) for the target subject and time; (3) repeat many times and take empirical quantiles. lme4::bootMer automates this for lmer fits; for nlme, simulate from VarCorr/ranef by hand.
Using standard \(\chi^2\) for variance component tests
Boundary problem: variance \(\geq 0\)
Use the right 50:50 mixture (\(\frac{1}{2}\chi^2_0+\frac{1}{2}\chi^2_1\) for one variance; \(\frac{1}{2}\chi^2_1+\frac{1}{2}\chi^2_2\) to drop a random slope), comparing to FLW Appendix Table C.1 critical values (a parametric bootstrap is a beyond-FLW option)
Not centering time in RS models
Hard-to-interpret intercept-slope correlation
Center at baseline or mid-study
Adding AR(1) without checking if RS is sufficient
Random slopes often absorb serial correlation
Check residual ACF after fitting RS first
Interpreting VarCorr “StdDev” as variance
Column shows standard deviation, not variance
Square the StdDev to get variance
Forgetting that BLUPs are shrunk
BLUPs are biased toward population mean
Use for prediction, not unbiased estimation
Extrapolating subject-specific predictions
BLUPs unreliable beyond observed data
Limit predictions to data range
RI/RS vs Pure Covariance Models (Ch. 7)
Many covariance patterns can be mimicked by random effects
Mixed models interpret\(\Sigma_i\) through heterogeneity in baseline/slopes
For prediction and patient-specific effects, mixed models are natural
Reporting Checklist (LMM)
Element
Details
Mean structure
Time coding, interactions, rationale
Random structure
RI/RS; covariance of random effects \(G\)
Residual structure
Heterogeneous SDs, AR(1) if any
Estimation method
ML/REML and model comparisons
Estimates
Fixed effects with CIs; variance components
Diagnostics
Residual ACF/variogram; BLUP plot for unusual subjects (not RE normality); influence
Sensitivity
Additional analyses
Quick Reference: nlme::lme Syntax
# Random-interceptlme(y ~ fixed_terms, random =~1| id, data = dat)# Random-slope (linear time)lme(y ~ fixed_terms, random =~ time | id, data = dat)# Add hetero residual SDs by visitlme(y ~ fixed_terms, random =~ time | id,weights =varIdent(~1| time_factor), data = dat)# Add residual AR(1)lme(y ~ fixed_terms, random =~ time | id,correlation =corAR1(~ time | id), data = dat)
The Same Model in SAS (PROC MIXED)
R is our runnable language (used in the slides above and in the homework). For students who work in SAS, here is the equivalent, shown for reference only and not executed. This mirrors FLW’s Computing section (8.9).
/* Not run here. SAS counterpart of the random-slope fit above, on the TLC data. */
PROC MIXED DATA=tlc METHOD=REML;
CLASS id trt;
MODEL y = time trt time*trt / SOLUTION;
RANDOM intercept time / SUBJECT=id TYPE=UN G;
RUN;
RANDOM ... / SUBJECT=id is the SAS counterpart of R’s random = ~ time | id (or (time | id) in lmer).
METHOD=REML matches the nlme::lme/lmer default; use METHOD=ML for LRTs of fixed effects.
TYPE=UN is an unstructured \(G\) (correlated intercept and slope); TYPE=VC makes them independent.
Quick Reference: ML vs REML
Goal
Method
Compare fixed effects
Refit both models with ML; LRT/AIC on ML
Choose random/residual structure
Fit with REML; compare REML AIC/LRT
Side Note: lme vs lmer
Package
Strengths
nlme::lme
Residual correlation structures and heteroscedastic weights; no built-in Satterthwaite
lme4::lmer
Fast, robust; pair with lmerTest for df; no built-in residual AR(1)/varIdent
Choose based on needs; results for fixed effects are often similar.
Worked Example: ML vs REML
fit_full_ML <-lme(y ~ group * time_f, random =~ time_num | id, data = tlc, method ="ML", control = ctrl)fit_noint_ML <-lme(y ~ group + time_f, random =~ time_num | id, data = tlc, method ="ML", control = ctrl)anova(fit_noint_ML, fit_full_ML)
Worked Example: ML vs REML
Model df AIC BIC logLik Test L.Ratio p-value
fit_noint_ML 1 9 2578.232 2614.155 -1280.116
fit_full_ML 2 12 2481.167 2529.064 -1228.583 1 vs 2 103.0653 <.0001
id pop ss
1 1 23.646 25.63256
2 2 20.762 22.22903
3 3 20.762 24.51969
4 4 23.646 22.34439
5 5 20.762 10.05191
6 6 20.762 11.78730
Model Criticism: Overdispersion/Heavy Tails
If residuals or random effects are heavy-tailed, consider:
Transformations, variance functions, robust LMMs (e.g., t-distributed random effects/errors)
Check sensitivity to outliers
Time as Factor vs Continuous: Guidance
Approach
Best For
Factor
Flexible at planned visits; good for EMMs at specific times
Continuous
Interpretable global slope; good with many time points or mistiming
Combined
Factor in fixed, continuous in random (RS) is often optimal
Growth Curves vs Spline LMMs
Polynomial growth curves are easy but can oscillate
Natural cubic splines are more stable
Choose df by AIC/CV and subject-matter considerations
TVCs and Lagged Effects
Incorporate lagged covariates if plausible causal ordering
Beware of post-treatment covariates when estimating treatment effects
Beyond Gaussian LMMs
GLMMs for non-Gaussian outcomes (binary, counts)
Similar principles; estimation via Laplace/AGQ
Not covered in this chapter
Cheat-Sheet: Model Choice
Question
Solution
Slopes differ across subjects?
Add RS
Residual SD varies by visit?
Add varIdent
Leftover serial correlation?
Add AR(1)
Need predictions per subject?
Use Mixed models
Only marginal contrasts?
Consider GEE as sensitivity
Cheat-Sheet: Testing
Test Type
Method
Fixed effects
ML for model comparison; large-sample Wald z / LRT (FLW). KR/Satterthwaite df is a modern add-on
Random effects
REML LRT vs the 50:50 mixture (FLW Appendix Table C.1; use \(\alpha = 0.10\) for \(q\)-vs-\(q+k\), \(k>1\)). Bootstrap is beyond FLW
Do not mix ML and REML across compared models.
What to Include in Your Report
Data structure and missingness; time coding
Final mean, random, residual structures; estimation method
Key estimates (fixed) with CIs; variance components
Diagnostics; sensitivity; clear interpretations
Recipe Card: Fitting Mixed-Effects Models
Step-by-Step Workflow for LMM Analysis
Step 1: Data Preparation
# Ensure data is in long format with:# - Subject ID as factor# - Time as both factor (for rich mean) and numeric (for slopes)dat <- dat %>%mutate(id =factor(id),time_f =factor(time),time_num =as.numeric(time),time_c = time_num -mean(time_num)) # centered
Step 2: Fit Candidate Models (REML)
# Start with random interceptfit_ri <-lme(y ~ group * time_f, random =~1| id,data = dat, method ="REML")# Add random slopefit_rs <-lme(y ~ group * time_f, random =~ time_num | id,data = dat, method ="REML")# Add heterogeneous residual variancesfit_rs_het <-update(fit_rs, weights =varIdent(~1| time_f))
Step 3: Compare Random Structures (REML AIC)
AIC(fit_ri, fit_rs, fit_rs_het) # Lower is better
Step 4: Check Diagnostics
plot(fit_rs) # Residuals vs fittedqqnorm(resid(fit_rs, type ="normalized"))ACF(fit_rs) # Check for leftover correlationqqnorm(ranef(fit_rs)[,1]) # BLUPs: spot unusual subjects, NOT a normality test (FLW p.273)
Step 5: Test Fixed Effects (refit with ML)
fit_full_ML <-update(fit_rs, method ="ML")fit_reduced_ML <-lme(y ~ group + time_f, random =~ time_num | id,data = dat, method ="ML")anova(fit_reduced_ML, fit_full_ML) # LRT for interaction
Mixed-effects = flexible, interpretable way to model individual trajectories and covariance
Shrinkage improves estimation for sparse/noisy subjects
Careful model comparison (ML vs REML) and boundary-aware tests are crucial
Further Reading and For Next Time
FLW Ch. 8: the Case Studies (8.8), the Computing section on PROC MIXED (8.9), and Further Reading (8.10).
Laird & Ware (1982), the original two-stage random-effects formulation behind 8.4.
Next: FLW Ch. 9, Fixed Effects versus Random Effects Models and the Hausman test (Lecture 9).
Note
A note on the data. TLC is our course running example (carried over from Ch. 5-7 for continuity), not a Ch. 8 dataset. FLW Ch. 8’s own case studies (Section 8.8) are the Six Cities log(FEV1) data, fit with hybrid random-effects + serial-correlation models, and the ACTG 193A log(CD4) counts, fit with a linear-spline LME whose empirical-BLUP trajectories appear in Fig. 8.10.