Understanding the Fixed and Random Effects
Source: LEMMA online multilevel modelling course (University of Bristol)
loading libraries and data for this session
rm(list=ls())
setwd("D:/_Lab_Data/R/R_Labs")
library(lme4); library(lattice)
data1<- "HLM/Sample.txt"
mydata <- read.table(file = data1, sep = ",", header = TRUE)
head(mydata)
## caseid schoolid score cohort90 female sclass schtype schurban schdenom
## 1 18 1 0 -6 1 2 0 1 0
## 2 17 1 10 -6 1 2 0 1 0
## 3 19 1 0 -6 1 4 0 1 0
## 4 20 1 40 -6 1 3 0 1 0
## 5 21 1 42 -6 1 2 0 1 0
## 6 13 1 4 -6 1 2 0 1 0
A multilevel model of attainment with school effects
nullmodel <- lmer(score ~ (1 | schoolid), data = mydata, REML = FALSE)
summary(nullmodel)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: score ~ (1 | schoolid)
## Data: mydata
##
## AIC BIC logLik deviance df.resid
## 286545.1 286570.4 -143269.5 286539.1 33985
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.9763 -0.7010 0.1017 0.7391 3.0817
##
## Random effects:
## Groups Name Variance Std.Dev.
## schoolid (Intercept) 61.02 7.812
## Residual 258.36 16.073
## Number of obs: 33988, groups: schoolid, 508
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 30.6006 0.3694 82.83
Variance partition coefficient (VPC)
61.02/319.38 = 0.19
which indicates that 19% of the variance in attainment can be attributed to differences between schools.
Testing the school effects (residuals)
fit <- lm(score ~ 1, data = mydata)
summary(fit)
##
## Call:
## lm(formula = score ~ 1, data = mydata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -31.095 -12.095 1.905 13.905 43.905
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 31.09462 0.09392 331.1 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 17.31 on 33987 degrees of freedom
logLik(nullmodel); logLik(fit)
## 'log Lik.' -143269.5 (df=3)
## 'log Lik.' -145144.4 (df=2)
likelihood ratio test:
LR = 2(-143269.5 - -145144.4) = 3750 on 1 d.f.
Bearing in mind that the 5% point of a chi-squared distribution on 1 d.f. is 3.84,
there is overwhelming evidence of school effects on attainment. We will therefore
revert to the multilevel model with school effects
Examining the school effects (residuals)
u0 <- ranef(nullmodel, postVar = TRUE)
## Warning in ranef.merMod(nullmodel, postVar = TRUE): 'postVar' is
## deprecated: please use 'condVar' instead
x<-u0[1] # list
u0_ranef<-u0[[1]] # data.frame
u0se<-attr(u0[[1]],"postVar") # vector
schoolid <- as.numeric(rownames(u0_ranef))
u0tab <- cbind(schoolid, u0_ranef[,1], u0se)
colnames(u0tab) <- c("schoolid","u0","u0se")
u0tab[1:10, ]
## schoolid u0 u0se
## [1,] 1 -11.841271 5.711611
## [2,] 2 3.206333 1.697110
## [3,] 3 3.396003 2.242029
## [4,] 4 -7.415008 4.289248
## [5,] 5 3.426227 2.657075
## [6,] 6 12.433731 1.968681
## [7,] 7 -1.651930 2.131068
## [8,] 8 20.978767 4.085753
## [9,] 9 -8.694877 41.445299
## [10,] 10 1.737382 3.626897
u0tab <- u0tab[order(u0tab[,2]), ]
u0tab <- cbind(u0tab, c(1:dim(u0tab)[1]))
colnames(u0tab)[4] <- "u0rank"
u0tab <- u0tab[order(u0tab[,1]), ]
u0tab[1:10, ]
## schoolid u0 u0se u0rank
## [1,] 1 -11.841271 5.711611 37
## [2,] 2 3.206333 1.697110 337
## [3,] 3 3.396003 2.242029 344
## [4,] 4 -7.415008 4.289248 73
## [5,] 5 3.426227 2.657075 345
## [6,] 6 12.433731 1.968681 487
## [7,] 7 -1.651930 2.131068 199
## [8,] 8 20.978767 4.085753 508
## [9,] 9 -8.694877 41.445299 59
## [10,] 10 1.737382 3.626897 291
schoolid<-u0tab[,1]
u0val<-u0tab[,2]
u0se<-u0tab[,3]
u0rank<-u0tab[,4]
plot(u0rank, u0val, type = "n", xlab = "u_rank", ylab = "conditional modes of r.e. for school_id:_cons")
segments(u0rank, u0val - 1.96*u0se, u0rank, u0val + 1.96*u0se)
points(u0rank, u0val, col = "blue")
abline(h = 0, col = "red")
