Content

Using Julia for multilevel modelling

Dataset

Model

Variables

scale : attitudes towards immigration (0-10, 33 values) agea : age (quantitative) pays / cntry : pays essround : vague dbenquC*te

With Stata

mixed scale agea [w=w] || pays: agea || essround: agea

predict yhat , fitted

Results

(sampling weights assumed)

Obtaining starting values by EM: 

Performing gradient-based optimization: 

Iteration 0:   log pseudolikelihood = -371021.98  
Iteration 1:   log pseudolikelihood = -371021.98  

Computing standard errors:

Mixed-effects regression                        Number of obs      =    170359

-----------------------------------------------------------
                |   No. of       Observations per Group
 Group Variable |   Groups    Minimum    Average    Maximum
----------------+------------------------------------------
           pays |       16       7590    10647.4      16201
       essround |       96       1149     1774.6       2858
-----------------------------------------------------------

                                                Wald chi2(1)       =     56.69
Log pseudolikelihood = -371021.98               Prob > chi2        =    0.0000

                                  (Std. Err. adjusted for 16 clusters in pays)
------------------------------------------------------------------------------
             |               Robust
       scale |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        agea |   -.011606   .0015415    -7.53   0.000    -.0146273   -.0085848
       _cons |   5.823571   .1225367    47.53   0.000     5.583403    6.063738
------------------------------------------------------------------------------

------------------------------------------------------------------------------
                             |               Robust           
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
pays: Independent            |
                   var(agea) |   .0000309   .0000109      .0000154    .0000618
                  var(_cons) |   .2100469   .0697941      .1095167    .4028584
-----------------------------+------------------------------------------------
essround: Independent        |
                   var(agea) |   .0000104   2.37e-06      6.70e-06    .0000163
                  var(_cons) |    .045416   .0118672      .0272139    .0757927
-----------------------------+------------------------------------------------
               var(Residual) |    3.96908   .1949909      3.604727     4.37026
------------------------------------------------------------------------------

R + lme4

library(lme4)
## Loading required package: Matrix
## Loading required package: Rcpp
df <- read.csv("/home/me/df.csv")

fm1 <- lmer(scale ~ agea2 + ( 1 + agea2 | cntry ) + ( 1 + agea2 | essround), data=df, weights=w)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
##  - Rescale variables?
summary(fm1)
## Linear mixed model fit by REML ['lmerMod']
## Formula: scale ~ agea2 + (1 + agea2 | cntry) + (1 + agea2 | essround)
##    Data: df
## Weights: w
## 
## REML criterion at convergence: 823608.8
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -8.0905 -0.3942  0.0411  0.4416 10.3865 
## 
## Random effects:
##  Groups   Name        Variance  Std.Dev.  Corr 
##  cntry    (Intercept) 2.285e-01 0.4780656      
##           agea2       3.572e-05 0.0059769 -0.02
##  essround (Intercept) 1.463e-02 0.1209345      
##           agea2       6.481e-07 0.0008051 0.16 
##  Residual             4.131e+00 2.0324689      
## Number of obs: 170359, groups:  cntry, 16; essround, 6
## 
## Fixed effects:
##              Estimate Std. Error t value
## (Intercept)  5.825191   0.131082   44.44
## agea2       -0.011668   0.001591   -7.33
## 
## Correlation of Fixed Effects:
##       (Intr)
## agea2 -0.046

Julia + MixedModels

julia -p 4

using DataFrames
using MixedModels

df = readtable("/home/me/df.csv")


fm1  = fit(lmm(scale ~ 1 + agea + (1+ agea |pays) + (1 + agea |essround), df))


Linear mixed model fit by maximum likelihood
Formula: scale ~ 1 + agea + ((1 + agea) | pays) + ((1 + agea) | essround)

 logLik: -354086.877907, deviance: 708173.755814

Variance components:
               Variance        Std.Dev.     Corr.
 pays     0.228742671708289 0.47827050056
          0.000023400329230 0.00483738868 -0.02
 essround 0.009791089252406 0.09894993306
          0.000000027877676 0.00016696609  1.00
 Residual 3.736088501904096 1.93289640227
 Number of obs: 170359; levels of grouping factors: 16, 6

  Fixed-effects parameters:
               Estimate  Std.Error  z value
(Intercept)     5.84238   0.126896  46.0406
agea         -0.0116274 0.00123927 -9.38241

Calculation time

Differences

In Stata you get the results by predicting random effects and fitted values (additional computation)

In R and Julia, one only needs to extract fixed and random effects : fixef() and ranef()

Technical differences (According to Douglas Bates)

nlme (old R library) :

LME 4 (new R module)

MixedModels (Julia)

General comments by Bates :

bBecause I was involved in the development of all three packages I will take the liberty of commenting on the philosophy.b"

https://stat.ethz.ch/pipermail/r-sig-mixed-models/2014q4/022791.html

Conclusion

Julia remains a work in progress It may not still be production ready But it allows the estimation of complex models a lot easier than alternatives

Solid arguments in favor of Julia