The SAGE Handbook of Multilevel Modeling
Publication Year: 2013
Subject: Quantitative/Statistical Research
In this important new Handbook, the editors have gathered together a range of leading contributors to introduce the theory and practice of multilevel modeling.
The Handbook establishes the connections in multilevel modeling, bringing together leading experts from around the world to provide a roadmap for applied researchers linking theory and practice, as well as a unique arsenal of state-of-the-art tools. It forges vital connections that cross traditional disciplinary divides and introduces best practice in the field.
Part I establishes the framework for estimation and inference, including chapters dedicated to notation, model selection, fixed and random effects, and causal inference; Part II develops variations and extensions, such as nonlinear, semiparametric and latent class models; Part III includes discussion of missing data and robust methods, assessment of fit and ...
- Front Matter
- Back Matter
- Subject Index
Part I: Multilevel Model Specification and Inference
- Chapter 1: The Multilevel Model Framework
- Chapter 2: Multilevel Model Notation—Establishing the Commonalities
- Chapter 3: Likelihood Estimation in Multilevel Models
- Chapter 4: Bayesian Multilevel Models
- Chapter 5: The Choice between Fixed and Random Effects
- Chapter 6: Centering Predictors and Contextual Effects
- Chapter 7: Model Selection for Multilevel Models
- Chapter 8: Generalized Linear Mixed Models—Overview
- Chapter 9: Longitudinal Data Modeling
- Chapter 10: Complexities in Error Structures within Individuals
- Chapter 11: Design Considerations in Multilevel Studies
- Chapter 12: Multilevel Models and Causal Inference
Part II: Variations and Extensions of the Multilevel Model
- Chapter 13: Multilevel Functional Data Analysis
- Chapter 14: Nonlinear Models
- Chapter 15: Generalized Linear Mixed Models: Estimation and Inference
- Chapter 16: Categorical Response Data
- Chapter 17: Smoothing and Semiparametric Models
- Chapter 18: Penalized Splines and Multilevel Models
- Chapter 19: Hierarchical Dynamic Models
- Chapter 20: Mixture and Latent Class Models in Longitudinal and Other Settings
- Chapter 21: Multivariate Response Data
Part III: Practical Considerations in Model Fit and Specification
- Chapter 22: Robust Methods for Multilevel Analysis
- Chapter 23: Missing Data
- Chapter 24: Lack of Fit, Graphics, and Multilevel Model Diagnostics
- Chapter 25: Multilevel Models: Is GEE a Robust Alternative in the Presence of Binary Endogenous Regressors?
- Chapter 26: Software for Fitting Multilevel Models
Part IV: Selected Applications
- Chapter 27: Meta-Analysis
- Chapter 28: Modeling Policy Adoption and Impact with Multilevel Methods
- Chapter 29: Multilevel Models in the Social and Behavioral Sciences
- Chapter 30: Survival Analysis and the Frailty Model
- Chapter 31: Point-Referenced Spatial Modeling
- Chapter 32: Market Research and Preference Data
- Chapter 33: Multilevel Modeling of Social Network and Relational Data
Editorial arrangement and Introduction © Jeffrey S. Simonoff, Marc A. Scott and Brian D. Marx 2013
Chapter 1 © Jeff Gill & Andrew Womack 2013
Chapter 2 © Marc A. Scott, Patrick E. Shrout & Sharon L. Weinberg 2013
Chapter 3 © Harvey Goldstein 2013
Chapter 4 © Ludwig Fahrmeir, Thomas Kneib & Stefan Lang 2013
Chapter 5 © Zac Townsend, Jack Buckley, Masataka Harada & Marc A. Scott 2013
Chapter 6 © Craig K. Enders 2013
Chapter 7 © Russell Steele 2013
Chapter 8 © Geert Verbeke & Geert Molenberghs 2013
Chapter 9 © Nan M. Laird & Garrett M. Fitzmaurice 2013
Chapter 10 © Vicente Núñez-Antón & Dale L. Zimmerman 2013
Chapter 11 © Gerard van Breukelen & Mirjam Moerbeek 2013
Chapter 12 © Jennifer Hill 2013
Chapter 13 © Ciprian M. Crainiceanu, Brian S. Caffo & Jeffrey S. Morris 2013
Chapter 14 © Lang Wu & Wei Liu 2013
Chapter 15 © Charles E. McCulloch & John M. Neuhaus 2013
Chapter 16 © Jeroen Vermunt 2013
Chapter 17 © Jin-Ting Zhang 2013
Chapter 18 © Göran Kauermann & Torben Kuhlenkasper 2013
Chapter 19 © Marina Silva Paez & Dani Gamerman 2013
Chapter 20 © Ryan P. Browne & Paul D. McNicholas 2013
Chapter 21 © Helena Geys & Christel Faes 2013
Chapter 22 © Joop Hox 2013
Chapter 23 © Geert Molenberghs & Geert Verbeke 2013
Chapter 24 © Gerda Claeskens 2013
Chapter 25 © Robert Crouchley 2013
Chapter 26 © Andrzej T. Gałecki & Brady T. West 2013
Chapter 27 © Larry V. Hedges & Kimberly S. Maier 2013
Chapter 28 © James E. Monogan III 2013
Chapter 29 © David Rindskopf 2013
Chapter 30 © Ardo van den Hout & Brian D.M. Tom 2013
Chapter 31 © Andrew O. Finley & Sudipto Banerjee 2013
Chapter 32 © Adam Sagan 2013
Chapter 33 © Marijtje A. J. van Duijn 2013
First published 2013
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act, 1988, this publication may be reproduced, stored or transmitted in any form, or by any means, only with the prior permission in writing of the publishers, or in the case of reprographic reproduction, in accordance with the terms of licences issued by the Copyright Licensing Agency.
Enquiries concerning reproduction outside those terms should be sent to the publishers.
Library of Congress Control Number is Available
British Library Cataloguing in Publication data
A catalogue record for this book is available from the British Library
SAGE Publications Ltd
1 Oliver's Yard
55 City Road
London EC1Y 1SP
SAGE Publications Inc.
2455 Teller Road
Thousand Oaks, California 91320
SAGE Publications India Pvt Ltd
B 1/I 1 Mohan Cooperative Industrial Area
New Delhi 110 044
SAGE Publications Asia-Pacific Pte Ltd
3 Church Street
#10-04 Samsung Hub
Editor: Jai Seaman
Assistant editor: Anna Horvai
Production editor: Ian Antcliff
Copyeditor: Richard Leigh
Proofreader: Derek Markham
Indexer: David Rudeforth
Marketing manager: Ben Griffin-Sherwood
Cover design: Wendy Scott
Typeset by: C&M Digitals (P) Ltd, Chennai, India
Printed in Great Britain by: Henry Ling Limited, at the Dorset Press, Dorchester, DT1 1HP
Notes on Contributors
Sudipto Banerjee received an MS and PhD in Statistics from the University of Connecticut. Prior to this he received a BS (Honors) from Presidency College and an MStat from the Indian Statistical Institute, both in Calcutta (now called Kolkata), India. He is currently a tenured Professor of Biostatistics in the School of Public Health, University of Minnesota Twin Cities. His research focuses upon statistical modeling and analysis of geographically referenced datasets, Bayesian statistics (theory, methods, and applications), statistical computing/software, and the melding of numerical/physical models with field data from industrial hygiene. He has published over 60 peer-reviewed journal articles and several book chapters, and has co-authored a book titled Hierarchical Modeling and Analysis for Spatial Data (Chapman & Hall/CRC, 2009). He has overseen the development of several spatial software packages within the R statistical framework. In 2009 he was honored with the Abdel El Shaarawi Award from The International Environmetrics Society, accorded every year to a young investigator (below the age of 40) who has made outstanding contributions to the field of environmetrics. Sudipto is also an elected member of the International Statistical Institute and is an inductee of the Pi chapter of the Delta Omega National Honor Society.
Gerard van Breukelen is Associate Professor at the Department of Methodology and Statistics, Maastricht University, the Netherlands. He obtained his master's degree (cum laude) and PhD in mathematical psychology and psychometrics. He now specializes in the design and analysis of intervention studies in psychology and health sciences, in particular sample sizes for multilevel experiments and confounding in nonrandomized studies. He has been advisor to several PhD projects in optimal design, including that of his present co-author, and he is author and co-author to numerous publications in statistics, psychology, health sciences, and medicine. His teaching ranges from logistic regression and factor analysis to multilevel analysis and structural equation modeling.
Ryan P. Browne is Assistant Professor at the Department of Mathematics and Statistics at the University of Guelph. Ryan holds degrees in statistics from the University of Waterloo: BMath 2004, MMath 2006, and PhD 2009. Based on his PhD work, Ryan and his advisors were awarded the 2011 W.J. Youden Award in Interlaboratory Testing by the American Statistical Association. Ryan's current research focus is on model-based clustering and classification. In addition, he is interested in measurement models, specifically in assessing the quality of a measurement system, which was the focus of his PhD thesis.
Sean P. “Jack” Buckley is Commissioner of the National Center for Education Statistics (NCES). He is currently on leave from New York University, where he is Associate Professor of Applied Statistics. He also served previously as Deputy Commissioner of NCES from 2006 to 2008. Buckley is known for his research on school choice, particularly charter schools, and on statistical methods for public policy. He has also taught statistics and education policy at Georgetown University, Boston College, and the State University of New York at Stony Brook. Buckley also spent five years in the U.S. Navy as a surface warfare officer and nuclear reactor engineer, and he worked in the intelligence community as an analytic methodologist. He holds an AB in Government from Harvard and an MA and PhD in political science from SUNY Stony Brook.
Brian S. Caffo received his PhD from the University of Florida's Department of Statistics in 2001. He is Professor in the Department of Biostatistics at Johns Hopkins University. He has worked in the areas of statistical computing, categorical data analysis, and imaging data. He has a particular focus in the analysis of biological heterogeneity in brain imaging data for basic science and clinical research. He has developed statistical analysis methods for the analysis of MRI, functional MRI, Diffusion Tensor Imaging (DTI), EEG, electrocorticography (ECoG) and nuclear medicine imaging data. With Dr. Crainiceanu, he leads the Statistical Methods and Applications for Research in Technology (SMART, http://www.smart-stats.org) research group at Johns Hopkins. The group, boasting a diverse collection of faculty, postdoctoral fellows, and students, is leading efforts in the analysis of high-dimensional longitudinal and multilevel functional data, especially in the field of medical imaging.
Gerda Claeskens is Professor of Statistics affiliated with the research center OR and Business Statistics and with the Leuven Statistics Research Center of the KU Leuven, Belgium. Her research topics include lack-of-fit and goodness-of-fit testing, model selection, model averaging, and semi- and nonparametric estimation. She received her PhD degree from Limburgs Universitair Centrum in 1999, and previously held positions at TU Eindhoven, the Australian National University, and Texas A&M University. She co-authored the 2008 book Model Selection and Model Averaging (Claeskens and Hjort, Cambridge University Press, 2008), and is the author of numerous journal publications on the abovementioned research topics.
Ciprian M. Crainiceanu received his PhD in Statistics from Cornell University in 2003 and is an Associate Professor of Biostatistics at Johns Hopkins University. He is a specialist in functional, longitudinal, and nonparametric methods. His main expertise is in the area of massive and complex observational studies obtained from the integration of new measurement technologies and classical observational studies. His scientific interests are centered on biosignals such as electroencephalograms (EEG), multimodality Magnetic Resonance Imaging (MRI), and human activity measurements obtained from wearable devices. Together with Dr Caffo, Dr Crainiceanu leads the Statistical Methods for Analysis of Research Technologies (SMART, http://www.smart-stats.org) research group. Dr Crainiceanu is also a specialist in measurement error modeling; he co-authored the second edition of the monograph Measurement Error in Nonlinear Models: A Modern Perspective.
Robert Crouchley obtained a PhD in Mathematics (Statistics) from Imperial College, London, in 1991. His supervisor was Professor Sir David Cox. Rob is Professor of Applied Statistics in the Department of Economics, Lancaster University, and was (2002–2012) Director of the Lancaster University Management School (LUMS) Centre for e-Science; he has directed research projects on the development and deployment of grid-enabled statistical computing software. Rob also lectures on econometrics. He has published many applied research papers as well as writing and editing several books on statistical modeling. He is co-author of the book Multivariate Generalized Linear Mixed Models Using R (Chapman & Hall/CRC, 2011) with Damon Berridge. He currently represents the UK's Royal Statistical Society on the Government Statistical Service Methodology Advisory Committee. His research interests include statistical modeling, the development of database management, and grid computing systems.
Marijtje A.J. van Duijn is Associate Professor of Statistics in the Department of Sociology, University of Groningen. Her research interests are in the development and application of random effects (multilevel) models for discrete or complex data, such as longitudinal, grouped, or social network data, and often inspired by collaboration with social scientists. Together with Mark Huisman, she contributed two chapters to the Sage Handbook on Social Network Analysis (SAGE, 2011), one on statistical models, the other on software for social network analysis.
Craig Enders, PhD, is an Associate Professor in the Quantitative Psychology concentration in the Department of Psychology at Arizona State University, where he teaches graduate-level courses in missing data analyses, multilevel modeling, and longitudinal modeling. The majority of his research focuses on analytic issues related to missing data analyses and multilevel modeling. His book, Applied Missing Data Analysis, was published by Guilford Press in 2010.
Christel Faes is Professor of Statistics in the Interuniversity Institute for Biostatistics and Statistical Bioinformatics at the Hasselt University in Belgium. She received an MS degree in Mathematics (2000) from the University of Hasselt and a PhD degree in Biostatistics (2004) from the Hasselt University. Her research interests include statistical methods for correlated, multivariate, clustered, and spatial data, with applications in infectious diseases, veterinary epidemiology, and non-clinical studies. She serves as associate editor for Biometrics.
Ludwig Fahrmeir is Professor Emeritus for Statistics at the Department of Statistics, Ludwig-Maximilians-University of Munich. He obtained both his PhD and his Habilitation degree at the Technical University of Munich. After a short period as a visiting professor at the University of Dortmund, he was appointed full professor at the University of Regensburg in 1978. He returned to his hometown, Munich, in 1991 and became Professor at the Department of Statistics of the LMU. From 1995 to 2006 he served as the speaker of the collaborative research center “Statistical Analysis of Discrete Structures, with applications in Econometrics and Biometrics,” funded by the German National Science Foundation. His research interests cover asymptotic theory, state space models and semiparametric regression, longitudinal and spatial data analysis, Bayesian statistics, and a broad range of applications such as credit risk, development economics, and neuroscience. Becoming Professor Emeritus in 2010, he has increased free time to enjoy family life, skiing, traveling and guitar playing.
Andrew O. Finley received a PhD in Natural Resources Science and Management and an MS in Statistics from the University of Minnesota. He is currently an Assistant Professor in the Departments of Forestry and Geography at the Michigan State University. His research interests lie in developing methodologies for monitoring and modeling environmental processes, Bayesian statistics, spatial statistics, and statistical computing. Andrew received the 2009 American Statistical Association Section on Statistics and the Environment's Young Investigator Award.
Garrett Fitzmaurice is Professor of Psychiatry (Biostatistics) at the Harvard Medical School and Professor in the Department of Biostatistics at the Harvard School of Public Health. He is a Fellow of the American Statistical Association and a member of the International Statistical Institute. He has served as Associate Editor for Biometrics, the Journal of the Royal Statistical Society, Series B, and Biostatistics; currently, he is Statistics Editor for the journal Nutrition. His research and teaching interests are in methods for analyzing longitudinal and repeated measures data. A major focus of his methodological research has been on the development of statistical methods for analyzing repeated binary data and for handling the problem of attrition in longitudinal studies. Much of his collaborative research has concentrated on applications to mental health research, broadly defined. He co-authored the textbook Applied Longitudinal Analysis, 2nd edition (Wiley, 2011) and co-edited the book Longitudinal Data Analysis (Chapman & Hall/CRC Press, 2008).
Andrzej Gałecki is a Research Professor in the Division of Geriatric Medicine, Department of Internal Medicine, and Institute of Gerontology at the University of Michigan Medical School, and is Research Scientist in the Department of Biostatistics at the University of Michigan School of Public Health. He earned his MSc in Applied Mathematics (1977) from the Technical University of Warsaw, Poland, and an MD (1981) from the Medical University of Warsaw. In 1985 he earned a PhD in Epidemiology from the Institute of Mother and Child Care in Warsaw (Poland). He is a member of the Editorial Board of the Open Journal of Applied Sciences. Since 1990, Dr Gałecki has collaborated with researchers in gerontology and geriatrics. His research interests lie in the development and application of statistical methods for analyzing correlated and over-dispersed data. He developed the SAS macro NLMEM for nonlinear mixed effects models, specified as a solution to ordinary differential equations. He also proposed a general class of variance–covariance structures for the analysis of multiple continuous dependent variables measured over time. This methodology is considered to be one of the first approaches to joint models for longitudinal data.
Dani Gamerman earned a degree in Mechanical Engineering from the Military Engineering Institute, Rio de Janeiro, in 1980, an MSc in Statistics from the Instituto Nacional de Matemática Pura e Aplicada (IMPA) in 1983, and a PhD in Statistics from the University of Warwick in 1987. He has been Professor of Statistics at UFRJ since 1996, where he supervises MSc and PhD students. He has been an invited lecturer at various scientific meetings in Brazil and abroad. He has been a visiting lecturer at a few universities in Brazil and abroad, and colaborador honorífico of Universidade Rey Juan Carlos in Madrid. He is author of the books Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference, published by Chapman and Hall in 1997 (1st edition) and 2006 (2nd edition, with Hedibert F. Lopes) and Statistical Inference: An Integrated Approach (with Helio S. Migon), published by Arnold in 1999. He has published papers in many statistical journals, including Journal of the Royal Statistical Society, Series B and Biometrika. His current research interests include dynamic models, item response theory, spatial statistics, survival analysis, stochastic simulation, econometrics, and Bayesian inference.
Helena Geys is Professor of Statistics in the Interuniversity Institute for Biostatistics and Statistical Bioinformatics at the University of Hasselt in Belgium, and Associate Director of Nonclinical Statistics at Janssen Pharmaceutical Companies of Johnson & Johnson. She holds a master's degree in Mathematics from the University of Antwerp and a PhD degree in biostatistics from the University of Hasselt. She published methodological work on clustered nonnormal data with applications in developmental toxicity, risk assessment, spatial epidemiology, surrogate marker validation, and pseudo-likelihood inference. She served on the editorial panel of the Archives of Public Health, Belgium, was associate editor for the Journal of Agricultural, Biological and Environmental Statistics (JABES), and was a member of the JABES management committee. She has authored or co-authored more than 50 publications in major statistical and epidemiological journals and is one of the editors of a book on topics in modeling clustered data.
Jeff Gill is Professor of Political Science, Professor of Biostatistics, and Professor of Surgery (Public Health Sciences) at Washington University in St. Louis. He does extensive work in the development of Bayesian hierarchical models, nonparametric Bayesian models, elicited prior development from expert interviews, as well as in fundamental issues in statistical inference. He has extensive expertise in statistical computing, and Markov chain Monte Carlo (MCMC) tools in particular. His current theoretical work centers on new hybrid algorithms for product partition clustering, Bayesian meta-modeling, and Polya-tree mixture models. Current applied work includes energetics and cancer, long-term mental health outcomes from children's exposure to war, pediatric head trauma, and assessment of transplant centers.
Harvey Goldstein was Professor of Statistical Methods at the Institute of Education from 1997 to 2005. He is currently Professor of Social Statistics at the University of Bristol. He has been a member of the Council of the Royal Statistical Society, and chair of its Educational Strategy Group. He is currently chair of the technical advisory committee for the RSS Centre for Statistical Education. He was awarded the RSS Guy medal in Silver in 1998 and was elected a fellow of the British Academy in 1997. He has been the principal applicant on several major ESRC-funded research projects since 1981. His current major research interest is in the methodology of multilevel modeling. His recent book, Multilevel Statistical Models (Wiley, 2011, 4th edition) is the standard reference text in this important area of statistical data analysis. Most recently he has helped to develop efficient methods for handling missing data in multilevel models and procedures for unbiased and efficient record linkage of large datasets.
Masataka Harada is a postdoctoral research scientist at New York University's Center for the Promotion of Research Involving Innovative Statistical Methodology (PRIISM), where he is developing methods of sensitivity analysis, some of which (Imbens’ Sensitivity Analysis, -isa-, and Generalized Sensitivity Analysis, -gsa-) are available as Stata ado-files on the Statistical Software Components (SSC) website. He specializes in econometrics and political science. His main interest is causal inference of political theories using quasi-experimental methods. His recent work includes the quasi-experimental investigation of strategic use of debts in Southern States in the U.S. when the Voting Rights Act was enacted in 1965. He received an MA in Social Science and PhD in Public Policy from the University of Chicago.
Larry V. Hedges is the Board of Trustees Professor of Statistics and Social Policy and a faculty fellow of the Institute for Policy Research at Northwestern University. Dr Hedges is a national leader in the fields of educational statistics and evaluation. His research focuses broadly on the development and application of statistical methods for the social, medical, and biological sciences. He is best known for his work on developing statistical methods for meta-analysis. Widely published, Hedges has authored or co-authored numerous journal articles and books, including Statistical Methods for Meta-Analysis: A Practical Guide to Modern Methods of Meta-Analysis (with I. Olkin, Academic Press, 1985) and The Handbook of Research Synthesis (with H. Cooper, Russell Sage Foundation, 1993). Dr Hedges is a member of the National Academy of Education and the Society of Multivariate Experimental Psychology and a fellow of the American Statistical Association and the American Psychological Association.
Jennifer Hill is an Associate Professor of Applied Statistics and Co-director of the Center for the Promotion of Research Involving Innovative Statistical Methodology (PRIISM) at the Steinhardt School for Culture, Education, and Human Development at New York University. Hill earned her PhD in Statistics from Harvard University, followed by a postdoctoral fellowship in Child and Family Policy at Columbia University's School of Social Work. Hill's research for many years has negotiated the intersection between statistical methodology and research in the social and behavioral sciences, as well as educational policy. She is interested in methods and study designs that allow researchers to go beyond making purely associational observations to having the capacity to potentially answer causal questions. In particular, she focuses on situations in which it is difficult or impossible to perform traditional randomized experiments, or when even seemingly pristine study designs are complicated by missing data or hierarchically structured data. Hill has published in a variety of leading journals including Journal of the American Statistical Association, American Political Science Review, American Journal of Public Health, and Developmental Psychology.
Ardo van den Hout is a lecturer at the Department of Statistical Science at University College, London, UK. He has previously worked in the Medical Research Council Biostatistics Unit, Cambridge, UK. His primary research areas are multi-state survival models and longitudinal data analysis, with applications in epidemiology and biostatistics.
Joop Hox is full professor of Social Science Methodology at the department of Methodology and Statistics of the Faculty of Social Sciences at Utrecht University. As Methodology chair, he is responsible for the research, development, and teaching carried out at the faculty in the field of social science methods and techniques. His research interests are multilevel modeling and data quality in surveys. He has been invited speaker at several conferences and is chair of the Amsterdam Multilevel conferences, held every two years. He is editor of the journal Methodology, a founding member of the European Association of Methodology (EAM), and past editor of the EAM book series.
Göran Kauermann is Professor of Statistics at the Ludwig-Maximilians-University (LMU) Munich, Germany. After receiving his PhD in statistics at the Technical University Berlin, Germany, in 1994 he spent a year as visiting scholar at the University of Chicago, USA. He became Assistant Professor at the LMU in Munich, Germany, in 1998, and worked at the University of Glasgow, Scotland, as Senior Lecturer from 2000 to 2003. In 2003 he was appointed as full Professor of Statistics at Bielefeld University, Germany. Professor Kauermann is currently Editor of AStA – Advances in Statistical Analysis and from 2005 to 2013 he was Chair of the Deutsche Arbeitsgemeinschaft Statistik, a consortium of all German statistical societies. His research focuses on nonparametric models and penalized regression in generalized linear and generalized mixed models.
Thomas Kneib is Professor of Statistics at the Department of Economics, University of Göttingen, and speaker of the interdisciplinary Centre for Statistics at the University of Göttingen. He received his PhD in statistics in 2006 from the Department of Statistics, University of Munich. During his time as postdoctoral researcher, from 2006 until 2009, he was visiting Professor for Applied Statistics at the Faculty of Mathematics and Economics, University of Ulm, and substitute Professor for Statistics at the Department of Economics, University of Göttingen. In 2009 he became Professor for Applied Statistics at the department of mathematics, University of Oldenburg, before he moved to Göttingen in 2011. His research interests are semiparametric regression, spatial statistics, quantile and expectile regression, Bayesian statistics, and statistical learning techniques.
Torben Kuhlenkasper is Junior Professor of Applied Econometrics at Goethe University Frankfurt, Germany, and a Research Fellow of the Hamburg Institute of International Economics (HWWI). He graduated in Business Administration and Economics at Bielefeld University, Germany, in 2006 and 2007, where he also received his PhD in 2011. Before being appointed as a Junior Professor in 2012 he worked as a Senior Economist at HWWI in the field of empirical economic research. His research focuses on the application of non- and semiparametric regression models in econometrics, especially in labor economics and its related fields.
Nan Laird is the Harvey V. Fineberg Professor of Biostatistics at the Harvard School of Public Health. Dr. Laird has contributed to methodology in many different fields, including meta-analysis, statistical genetics, and longitudinal data. She is a co-author of the book Applied Longitudinal Analysis with Garrett Fitzmaurice and James Ware, and is a co-author of the book Fundamentals of Modern Statistical Genetics with Christoph Lange. She is the recipient of many awards and prizes, including Fellow of the American Statistical Association, and the Samuel Wilks Award in 2011.
Stefan Lang is Professor for Applied Statistics at the Department of Statistics, University of Innsbruck. He is editor of Advances in Statistical Analysis (AStA) and associate editor of Statistical Modelling. He received his PhD in statistics in 2001 from the Department of Statistics, University of Munich. He became Professor of Statistics at the University of Leipzig in 2005. He moved to the University of Innsbruck in October 2006. His research interests cover Bayesian statistics, semiparametric regression, spatial statistics, multilevel models, and applications in ecology, marketing science, development economics, etc.
Wei Liu received a BSc in Mathematics and an MSc in Statistics from Northeast Normal University in China. She then earned a second MSc and a PhD in Statistics from Memorial University of Newfoundland and the University of British Columbia in Canada, respectively. Since 2007, she has been an Assistant Professor in the Department of Mathematics and Statistics at York University in Toronto, Canada. Her current research interests include longitudinal data, missing data, measurement errors, mixed effects models, and order-restricted statistical inference.
Kimberly S. Maier is an Assistant Professor of Measurement and Quantitative Methods and an affiliate of the Educational Policy program in the College of Education at Michigan State University. Dr Maier's research focuses on the development of statistical methods for complex data structures, with an emphasis on the development and application of extensions to multilevel models for policy research. Currently, Maier is studying the application of multilevel item response theory to education achievement measures and attitudinal surveys using Bayesian techniques. Maier's methodological work as author or co-author appears in journals such as Journal of Educational and Behavioral Statistics and Structural Equation Modeling: A Multidisciplinary Journal.
Brian D. Marx is a full professor in the Department of Experimental Statistics at Louisiana State University. His main research interests include P-spline smoothing, ill-conditioned regression problems, and high-dimensional chemometric applications. He is currently serving as coordinating editor for the journal Statistical Modelling and is past chair of the Statistical Modelling Society. He has also taught as a visiting professor at Stanford University, University of Munich, and Utrecht University. He is co-author of the forthcoming (2013) book Regression: Models, Methods, and Applications, with Ludwig Fahrmeir, Thomas Kneib, and Stefan Lang.
Charles E. McCulloch is Professor of Biostatistics in the Division of Biostatistics in the Department of Epidemiology and Biostatistics at the University of California, San Francisco. He is a Fellow of the American Statistical Association, and author (with John M. Neuhaus and Shayle R. Searle) of Generalized, Linear, and Mixed Models, now in its second edition.
Paul D. McNicholas is a Professor and University Research Chair in Computational Statistics at the Department of Mathematics and Statistics at the University of Guelph. Prior to taking a position at Guelph, he completed a PhD in Statistics, an MSc in High Performance Computing, and an MA in Mathematics at Trinity College Dublin, Ireland. His research interests are in computational statistics, with a focus on classification and clustering using mixture models.
Mirjam Moerbeek is Associate Professor, Department of Methods and Statistics, Utrecht University, the Netherlands. She graduated with a master's degree in biometrics (cum laude) and wrote a PhD thesis on the design and analysis of experiments with multilevel data. She specializes in statistical power analysis and the optimal design of experiments, in particular for multilevel and longitudinal data. She is joint organizer of courses, conferences, and symposia on optimal design, multilevel analysis, and longitudinal data analysis. She has received prestigious research grants from the Netherlands Organization for Scientific Research to build her own research group and to supervise PhD students.
Geert Molenberghs is Professor of Biostatistics at Universiteit Hasselt and Katholieke Universiteit Leuven in Belgium. He received a BS degree in Mathematics (1988) and a PhD in biostatistics (1993) from Universiteit Antwerpen. He has published on surrogate markers in clinical trials, and on categorical, longitudinal, and incomplete data. He was Joint Editor of Applied Statistics (2001–2004), Co-editor of Biometrics (2007–2009), and Co-editor of Biostatistics (2010–2012). He was President of the International Biometric Society (2004–2005), received the Guy Medal in Bronze from the Royal Statistical Society and the Myrto Lefkopoulou award from the Harvard School of Public Health. Geert Molenberghs is founding director of the Center for Statistics. He is also the director of the Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-BioStat). Jointly with Geert Verbeke, Mike Kenward, Tomasz Burzykowski, Marc Buyse, and Marc Aerts, he has authored books on longitudinal and incomplete data, and on surrogate marker evaluation. Geert Molenberghs received several Excellence in Continuing Education Awards from the American Statistical Association for courses at Joint Statistical Meetings.
James E. Monogan III is an Assistant Professor in the Department of Political Science at the University of Georgia. He received his PhD from the University of North Carolina at Chapel Hill in 2010. His primary research interests are in political methodology and American politics. Within political methodology, he focuses on the analysis of geographically referenced and time-dependent data. In American politics, he studies the quality of representation in the United States, public opinion, elections, and state politics and policy. His research has been published in the Journal of Politics, State Politics & Policy Quarterly, Publius: The Journal of Federalism, and the Journal of Theoretical Politics.
Jeffrey S. Morris is a Professor and Deputy Chair in the Department of Biostatistics at the University of Texas M.D. Anderson Cancer Center. He received his PhD in Statistics from Texas A&M University in 2000. His main research contributions include developing new methods for complex, high-dimensional object data such as functional and quantitative image data, bioinformatics methods for high-throughput genomic and proteomic technologies, and collaborative medical and biological research in cancer. His methods have been applied to a variety of data, including carcinogenesis biomarker, actigraphy, microarray, copy number, fMRI, proteomics, and EEG data. His work has been recognized with a number of prestigious awards including the Harvard University Myrto Lefkopoulou Invited Lecture and the Noether Young Investigator Award.
John M. Neuhaus is Professor of Biostatistics in the Division of Biostatistics in the Department of Epidemiology and Biostatistics at the University of California, San Francisco. He is a Fellow of the American Statistical Association, and author (with Charles E. McCulloch and Shayle R. Searle) of Generalized, Linear, and Mixed Models, now in its second edition.
Vicente Núñez-Antón is Professor of Statistics, Department of Econometrics and Statistics, University of the Basque Country UPV/EHU, Bilbao, Spain. He received his undergraduate degree in Electronic Engineering (1987) from ESPOL (Ecuador), and a Master in Theoretical Statistics (1989) and a PhD in Statistics (1993) from The University of Iowa (USA). His research interests include longitudinal data analysis, survival analysis, nonparametric estimation methods, Bayesian methods, goodness-of-fit testing and health-related quality of life (HRQoL) methods. He is one of the founders and one of the main researchers of the BIOSTATNET Network in Biostatistics. He has authored or co-authored four books and more than 50 scholarly articles in peer-reviewed journals. He is an elected member of the International Statistical Institute. He was associate editor for Applied Statistics, and is currently associate editor for Statistical Modelling and METHODOLOGY – European Journal of Research Methods for the Behavioral and Social Sciences.
David Rindskopf is Distinguished Professor of Educational Psychology and Psychology at the City University of New York Graduate School. He is a Fellow of the American Statistical Association and the American Educational Research Association, Past President of the Society for Multivariate Experimental Psychology, and Editor of the Journal of Educational and Behavioral Statistics. His research interests are categorical data, latent variable models, and multilevel models. Current projects include: (i) showing how people subconsciously use complex statistical methods to make decisions in everyday life, (ii) introducing floor and ceiling effects into logistic regression to model response probabilities constrained to a limited range, (iii) using multilevel models to analyze data from single case designs.
Adam Sagan, PhD is Associate Professor of Marketing Research at Cracow University of Economics, Cracow, Poland. His main research interests lie in two fields of marketing research. The first is the analysis of consumer behavior and market segmentation based on means–end chains theory and laddering interviews. The second field of interest is the application of latent variable and path models in the analysis of buyer–seller interactions and measurement of the dimensions of value for the customer in relationship marketing. He is author of the handbook of marketing research (in Polish) and has published papers in Advances in Data Analysis and Journal of Targeting, Measurement and Analysis for Marketing.
Rens van de Schoot graduated cum laude with a research master's degree in Development and Socialization of Children and Adolescents at Utrecht University in The Netherlands. He obtained his PhD, also cum laude, on informative hypotheses and Bayesian statistics at the department of Methods and Statistics. Currently, he is Assistant Professor at Utrecht University and Extraordinary Professor at the North-West University in South Africa. Besides his research on how to directly evaluate expectations and the labor market position of PhD candidates, he collaborates with many developmental researchers. He is co-editor of the European Journal of Developmental Psychology, president of the young researchers union of the European Association of Developmental Psychology, and he is vice-chair for the scientific committee of the Dutch Institute for Psychologists (NIP).
Marc A. Scott is an Associate Professor of Applied Statistics at New York University's Steinhardt School of Culture, Education, and Human Development, and co-directs the Center for the Promotion of Research Involving Innovative Statistical Methodology (PRIISM). He received his PhD in Statistics from New York University and spent the next two years as a Senior Research Associate at Columbia University's Institute on Education and the Economy. Scott's research primarily involves the development of statistical models for longitudinal data, including models for sequence analysis and event histories. This methodological work has been motivated through collaborations with policy researchers on topics such as wage inequality and immobility, educational attainment, child development and achievement, disease progression in respiratory illness and reliability of sleep studies. Recent methodological papers have appeared in Statistical Science, Sociological Methodology, Journal of Educational and Behavioral Statistics, Journal of the Royal Statistical Society Series C, and Statistics in Medicine.
Patrick E. Shrout (PhD, 1976, University of Chicago) is Professor of Psychology at NYU, where he teaches statistics and measurement courses to graduate and undergraduate students. Prior to moving to NYU, he was on the biostatistics faculty at Columbia University. His methodological interests are on inferences that can be made from non-experimental data. His substantive interests are stress, social support and coping in intimate relationships, and in cross-cultural studies in psychiatric epidemiology. He is Past-President of both the American Psychopathological Association and the Society of Multivariate Behavioral Research. Shrout has been elected Fellow of the American Association for the Advancement of Science, American Statistical Association, the American Psychological Association (Divisions 5 and 8), the Association for Psychological Science, and the Society of Experimental Social Psychology. He is currently an Associate Editor of Psychological Methods.
Marina Silva Paez earned a degree in Statistics from Universidade Federal do Rio de Janeiro (UFRJ) in 1998, MSc in Statistics from UFRJ in 2000, and PhD in Statistics from UFRJ in 2004. She was a research student at Lancaster University from September 2003 to August 2004, and has been Associate Professor of Statistics at UFRJ since 2005. She supervises MSc and PhD students, and has published papers in statistical journals, including Environmetrics and Environmental and Ecological Statistics, mainly on the subject of space-time Bayesian models. She has been an invited speaker at many scientific meetings in Brazil. Her current research interests include space-time models, spatial statistics, dynamic models, item response theory, and Bayesian inference.
Jeffrey S. Simonoff is Professor of Statistics in the Department of Information, Operations and Management Sciences of the Leonard N. Stern School of Business at New York University. He is a Fellow of the American Statistical Association, a Fellow of the Institute of Mathematical Statistics, and an Elected Member of the International Statistical Institute. He has been at NYU Stern since receiving his PhD in Statistics from Yale University in 1980. He has authored or co-authored more than 90 articles and five books on the theory and applications of statistics, including A Casebook for a First Course in Statistics and Data Analysis with S. Chatterjee and M.S. Handcock (1995), Smoothing Methods in Statistics (1996), Analyzing Categorical Data (2003), Nonprofit Trusteeship In Different Contexts with R. Abzug (2004), and Handbook of Regression Analysis with S. Chatterjee (2013). His research interests include computer-intensive statistical methodology, modern regression methods, categorical data, smoothing methods, and applications of statistics to business, industrial, and scientific problems.
Russell Steele is an Associate Professor in the Department of Mathematics and Statistics at McGill University. He also holds a position as an investigator at the Jewish General Hospital in the Division of Clinical Epidemiology. His primary statistical methodological interests lie in the areas of methods for analyzing data with missing values and Bayesian model selection, although he is more broadly interested in statistical applications. He has a broad range of substantive interests in medicine, publishing work in rheumatology, sports medicine, and design and interpretation of meta-analyses. In particular, he has published several papers in high-impact rheumatology journals with other members of the Canadian Scleroderma Research Group over the last seven years. Russell has also collaborated with scientists in geography and the biological sciences on problems in multilevel modeling.
Brian D.M. Tom is an Investigator Scientist at the UK Medical Research Council Biostatistics Unit and an Affiliated Lecturer at the Department of Pure Mathematics and Mathematical Statistics of the University of Cambridge. He has previously worked as a Consulting Statistician and a Project Statistician within the University of Cambridge's Department of Public Health and Primary Care. His research areas are quite varied, ranging from epidemiology to biostatistics and bioinformatics. On the methodological side, he has particular interests in bias and efficiency, causal and dynamic modeling, event history and survival analysis, integrative genomics, and longitudinal data analysis. On the applied side, he has worked and is working in various chronic and acute disease areas, such as Rheumatology, Hepatitis C and Pandemic Influenza.
Zachary Townsend is a research affiliate at New York University's Center for the Promotion of Research Involving Innovative Statistical Methodology (PRIISM). Zac's research interests include causal inference in policy analysis, and the use of multilevel modeling in education and criminal justice contexts. As a research affiliate, he works closely with the PRIISM faculty on research projects. Zac received an MPA from New York University's Wagner School of Public Service, where he focused on public finance and statistical methods, and an AB in Applied Mathematics–Economics and Public Policy from Brown University.
Geert Verbeke has published extensively on various aspects of mixed models for longitudinal data analyses, about which he co-authored and co-edited several textbooks (Springer Lecture Notes 1997; Springer Series in Statistics 2000 and 2005; Chapman & Hall/CRC 2009). He held visiting positions at the Gerontology Research Center and the Johns Hopkins University (Baltimore, MD), was International Program Chair for the International Biometric Conference in Montreal (2006), Joint Editor of the Journal of the Royal Statistical Society, Series A (2005–2008), and Co-editor of Biometrics (2010–2012). He served and serves on a variety of committees of the International Biometric Society, is an elected Fellow of the American Statistical Association and Elected Member of the International Statistical Institute. He was elected international representative in the Board of Directors of the American Statistical Association (2008–2010). Geert Verbeke has earned Excellence in Continuing Education Awards in 2002, 2004, 2008, and 2011 for short courses taught at the Joint Statistical Meetings of the American Statistical Association. He received the International Biometric Society Award for the best Biometrics paper in 2006, and received accreditation as a professional statistician by the American Statistical Association (2010–2016).
Jeroen K. Vermunt is a full professor in the Department of Methodology and Statistics at Tilburg University, the Netherlands. His research is on methodologies of social, behavioral, and biomedical research, with a special focus on latent variable models and techniques for the analysis of categorical, multilevel, and longitudinal data sets. He has widely published on these topics in statistical and methodological journals and has also co-authored many articles in applied journals in which these methods are used to solve practical research problems. He is the co-developer (with Jay Magidson) of the Latent GOLD software package. In 2005, Vermunt was awarded the Leo Goodman award by the Methodology Section of the American Sociological Association.
Sharon L. Weinberg is Professor of Applied Statistics and Psychology and former Vice Provost for Faculty Affairs at New York University. She received an AB degree in mathematics and a PhD degree in psychometrics and research design methodology from Cornell University. Dr Weinberg is the author of over sixty articles, books, and reports on statistical methodology, statistical education, evaluation, and on such applied areas as clinical and school psychology, special education, and higher education. She is the recipient of several major grants from Federal agencies, including the National Science Foundation, the National Institute of Drug Abuse, and the Office of Educational Research and Improvement. Her current textbook, Statistics Using SPSS: An Integrative Approach, co-authored with former graduate student Sarah Knapp Abramowitz, and published by Cambridge University Press, is in its second edition. She is co-editor of Diversity in American Higher Education: Toward a More Comprehensive Approach, published in 2011 by Routledge Press. She is currently on the Editorial Board of Educational Researcher.
Brady T. West is a Research Assistant Professor in the Survey Methodology Program, located within the Survey Research Center at the Institute for Social Research on the University of Michigan–Ann Arbor (U-M) campus. He also serves as a Statistical Consultant at the U-M Center for Statistical Consultation and Research (CSCAR). He earned his PhD from the U-M Program in Survey Methodology in 2011. Before that, he received an MA in Applied Statistics from the U-M Statistics Department in 2002, being recognized as an Outstanding First-year Applied Masters student. His current research interests include the implications of measurement error in auxiliary variables and survey paradata for survey estimation, survey nonresponse, interviewer variance, and multilevel regression models for clustered and longitudinal data. He is the lead author of a book comparing different statistical software packages in terms of their mixed effects modeling procedures (Linear Mixed Models: A Practical Guide using Statistical Software, Chapman Hall/CRC Press, 2006), with a second edition currently being written, and he is a co-author of a second book entitled Applied Survey Data Analysis (with Steven Heeringa and Pat Berglund), which was published by Chapman and Hall in April 2010.
Andrew J. Womack is a Postdoctoral Research Associate in the Department of Statistics at the University of Florida. His research focuses on objective Bayesian model selection, Bayesian hierarchical modeling, and cluster analysis. He is also interested in statistical applications related to gene expression data, signal processing, machine learning, and social science research. Andrew earned a BS from the University of Kansas and a PhD in Mathematics from Washington University in St. Louis.
Lang Wu received a BSc from East China Normal University, an MSc from Tulane University, and a PhD in Statistics from the University of Washington at Seattle. He then joined Harvard School of Public Health as a postdoctoral fellow. Since 2000, he has been an Assistant Professor, Associate Professor, and full Professor in the Department of Statistics at the University of British Columbia in Vancouver, Canada. His current research interests include longitudinal data analysis, missing data and measurement error problems, multivariate one-sided hypothesis testing, and joint models. His book Mixed Effects Models for Complex Data (Chapman & Hall/CRC, 2009) provides an overview of mixed effects models, including nonlinear mixed effects models and generalized linear mixed models for longitudinal data and multilevel data.
Jin-Ting Zhang is currently an Associate Professor, Department of Statistics and Applied Probability, National University of Singapore, and a visiting fellow in the Department of Operation Research and Financial Engineering, Princeton University, USA. He earned his PhD in 1999 at the Department of Statistics, University of North Carolina at Chapel Hill, USA under the supervision of Professors Jianqing Fan and James Steve Marron. He has published two books and a number of research papers in top journals. His research interests include longitudinal data analysis, functional data analysis, high-dimensional data analysis, nonparametric smoothing, and heteroscedastic ANOVA and MANOVA, among others.
Dale L. Zimmerman is Robert V. Hogg Professor, Department of Statistics and Actuarial Science, University of Iowa. He received his PhD in Statistics from Iowa State University in 1986. His research interests include spatial statistics, longitudinal data analysis, multivariate analysis, mixed linear models, and environmental statistics. He has authored or co-authored two books and more than 70 scholarly articles in peer-reviewed journals. He was named a Fellow of the American Statistical Association in 2001, and he received the Distinguished Achievement Award from the Section on Statistics and Environment of the American Statistical Association in 2007. He is currently an associate editor for Journal of the Royal Statistical Society, Series B and for Annals of Applied Statistics.
Multilevel models (MLMs) are statistical models that, broadly speaking, are characterized by complex patterns of variability, usually focusing on nested structures of, for example, students in schools, animals in litters, longitudinal measurements of individuals or countries, and so on. Such models have a long history, being associated with early work on design and analysis of experiments, repeated measures, and Bayesian models. However, until recently, researchers could not ask or answer many important questions: often the data were too expensive to collect and analysis was hampered by limited computational resources. As late as the mid-1990s, there were only a few software packages for fitting multilevel models. Now, an applied researcher has many software choices, but given that each package was developed by researchers who posed their questions in slightly different frameworks and disciplinary contexts, there are major differences in notation, data structures, and so forth, associated with these packages.
Different approaches have led to similar but distinct strands in the literature to the point at which we have many different names for different forms or formulations of MLMs, such as random effects models, random coefficient models, and hierarchical linear models. The list is much larger if we examine subfields such as panel data models, which are often referred to as longitudinal models, pooled cross-sectional and time-series models, or growth curve models. Mixed effect, random effect, and fixed effect models all have a specific meaning in certain disciplines, but the distinction tends to confuse rather than clarify their purpose. Again, these distinctions often reflect the historical development of the framework, be it biology, psychology, sociology, policy science, market or econometric research, and so forth.
This volume is premised on the notion that researchers across and within a variety of disciplines have much to learn from one another, particularly in the area of multilevel modeling, where many related ideas converge. Moreover, the ways in which these ideas differ from each other are equally illuminating. By examining the common threads and subtle differences between methods that we commonly lump together under the rubric “multilevel modeling,” we hope that the applied researcher will recognize where her problem fits in this arena, and perhaps even discover that the answer to a slightly reformulated question is more informative for her research. We expect that sharing information across disciplinary boundaries will lead to significant gains in productivity as well as open up new areas of inquiry. We note, however, that this is too often hampered by jargon, notation, or, more simply, a lack of communication and awareness of what has already been done in other fields. This volume seeks to break down disciplinary boundaries by organizing and framing the types of questions addressed by different constituencies so that they form a coherent body of knowledge, while still maintaining the unique contributions of each. In essence, our goal is to deconstruct the vast and disconnected literature and reconstruct it in this thorough and unified handbook.
Even with a substantial literature in this area, the answers to methodological questions that are fairly well-established for methods such as regression are much less definitely addressed for MLMs. For example, in regression, we have a fairly clear set of approaches to missing data, graphics and diagnostic methods, and matters of robustness and model selection. For MLMs, each of these poses a fairly unique challenge. For example, there is a substantial literature on pattern missingness that is crucial to the analysis of MLMs, while multiple imputation techniques for MLMs are less well developed and neither approach has fully penetrated mainstream software packages. This volume will gather the most current innovations in such areas and establish through both theory and practical example how to approach the variety of technical concerns that arise.
This volume is aimed at the Masters/Ph.D. level researcher, whether a statistician or a researcher in an applied field. We hope that it appeals to researchers who might have some exposure to the basics of multilevel modeling (as can be found in the many excellent introductory texts about MLMs that are available), but who are facing new research problems or who want to branch out to new fields, and can benefit from the volume's broad coverage, discussion of philosophical issues and controversies in the field, and detailed attention to applications in different fields. We also hope that it can serve researchers who wish to understand where their discipline-specific models fit in the broader universe of methods and models. We reiterate that each discipline and approach provides unique insights and perspectives.
We are pleased to have a truly impressive set of authors of chapters. They include many of the leading lights in the field, and many are fundamental contributors to the topics about which they have written. They bring a broad perspective to the material, representing eleven countries and four continents, coming from academia, research institutions, industry, and government, with backgrounds and experience in fields including biostatistics, economics, education, marketing, political science, psychology, and statistics.
Material related to the book can be found at its associated website, which provides access to the data sets used in the book, examples of computer code for performing analyses in the book, and an errata list. The reader is encouraged to visit http://people.stern.nyu.edu/jsimonof/MultilevelHandbook/.
We would like to thank Patrick Brindle for bringing this project to us, and thank Anna Horvai and him for their help in bringing it to fruition. We would also like to thank the authors of the chapters for sharing their time and expertise cheerfully and willingly. We would like to thank Jeff Gill, Mark Handcock, Jennifer Hill, Thomas Kneib, Charles McCulloch, Geert Molenberghs, John Neuhaus, Ardo van den Hout, and Geert Verbeke in particular for their editorial assistance in the preparation of this volume. We are hopeful that readers will benefit from reading this book, and are especially confident of that given how much we have learned just from editing it. Finally, we would like to thank our families for their love and support.
Marc A. Scott, New York, New York
Jeffrey S. Simonoff, New York, New York
Brian D. Marx, Baton Rouge, Louisiana
Multilevel Modeling
Jeffrey S. Simonoff, New York University, USA
Marc A. Scott, New York University, USA
Brian D. Marx, Louisiana State University, USA
Introduction
In this chapter we provide some introductory background for the chapters to come. We start with discussion of how multilevel modeling can be viewed as a natural generalization of three different types of models: regression, analysis of variance, and time series models. We then note how algorithmic advances and the introduction of statistical software have had a particularly strong effect on the development of multilevel modeling. This is followed by a description of the design and goals of the book, and we close with some discussion of what is not covered here, including future possibilities, applications, and opportunities.
Regression, ANOVA, and Time Series Models
Regression analysis is undoubtedly the most widely used statistical method. The standard regression model

$$y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_p x_{pi} + \varepsilon_i, \qquad (1)$$

with $\varepsilon$ forming a random sample from a Gaussian population with mean 0 and variance $\sigma^2$, despite its simplicity, is a remarkably powerful and flexible tool for data summary and prediction. Least-squares-based methods are well understood, and possess well-known optimality properties. Of course, when the assumptions underlying (1) do not hold, these favorable characteristics do not necessarily hold, and the model and estimation scheme need to be changed. One such situation is when observations arise in a nested or clustered fashion.
Such data occur often in practice. For example, in an educational setting, information and responses might be available at the level of each student, but an experimental condition might be administered at the classroom level. Since all of the students within a class are exposed to the same classroom conditions (that is, students are nested within classrooms), it would be expected that the errors $\varepsilon_i$ for students from the same classroom would be correlated with each other (reflecting unmodeled differences between classrooms, such as the effect of the teacher), a violation of the standard regression model assumptions.
Such nesting need not occur at just one level, as students can be nested within classrooms, which are nested within schools, which are nested within school districts, and so on. Further, in a situation like this the importance of each level of nesting could be different for different users. Effects at the level of an individual student are likely to be of greater interest to students and teachers, but from a policy point of view relationships at the school or district level might be a more important focus. What is needed is a way to generalize (1) to allow for this nested structure.
Consider now a simple experimental setting, where the response of interest is a numerical measure of patient health such as blood pressure level. The purpose of the experiment is to explore the relationship between blood pressure and several types of physical and mental exercise, such as yoga, meditation, aerobic exercise, and so on. In this situation a natural representation of the relationships in the data is the one-way analysis of variance (ANOVA) model
$$y_{ij} = \mu + \alpha_j + \epsilon_{ij}, \qquad (2)$$
where $y_{ij}$ is the response for the ith observation in the jth exercise group (for example, yoga) and $\alpha_j$ is the fixed effect of being in the jth group. This is, in fact, a special case of the regression model (1), with the predictors being effect codings that code the groups (Chatterjee and Simonoff, 2013, Section 6.3.1). It is well known that for this model the best estimated expected response for any member of the jth group is $\bar{y}_j$, the mean response for observations from that group.
Imagine instead a situation where a specific treatment regimen is applied to patients in different treatment facilities, with each treatment facility defining a group. In this situation it is reasonable to view the set of facilities as constituting a random sample from the population of treatment facilities. A model that is consistent with this is
$$y_{ij} = \mu + u_j + \epsilon_{ij}, \qquad (3)$$
where $u_j$ is the random effect of being in the jth group (treatment facility), a Gaussian random variable with mean 0 and variance $\sigma_u^2$ that is independent of $\epsilon$.
The key distinction between models (2) and (3) is that the fixed effects model (2) has only one source of variability ($\epsilon_{ij}$, with variance $\sigma^2$), while the random effects model (3) has two components of variance ($\epsilon_{ij}$ and $u_j$, with variances $\sigma^2$ and $\sigma_u^2$, respectively). These two sources of variability reflect the variability between observations that are in the same group, measured by the within-group variance $\sigma^2$, and the variability between observations that are in different groups, measured by the between-group variance $\sigma_u^2$. A single number that summarizes the relative importance of the two variance components is the intraclass correlation (ICC)
$$\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma^2},$$
which is the proportion of variability accounted for by the variability between groups. It is also equal to the correlation between two observations within the same group, reinforcing the correspondence of variability between groups and correlation within groups.
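The equality between the variance proportion and the within-group correlation follows in a few lines. The sketch below assumes that model (3) has the form $y_{ij} = \mu + u_j + \epsilon_{ij}$, with $u_j$ and $\epsilon_{ij}$ mutually independent and with variances $\sigma_u^2$ and $\sigma^2$; the index $i'$ for a second observation in the same group is notation introduced here for illustration.

```latex
\begin{aligned}
\operatorname{Var}(y_{ij}) &= \operatorname{Var}(u_j) + \operatorname{Var}(\epsilon_{ij}) = \sigma_u^2 + \sigma^2, \\
\operatorname{Cov}(y_{ij}, y_{i'j}) &= \operatorname{Var}(u_j) = \sigma_u^2 \qquad (i \neq i'), \\
\operatorname{Corr}(y_{ij}, y_{i'j}) &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma^2} = \rho .
\end{aligned}
```

Two observations from the same group share only the term $u_j$, so their covariance is exactly the between-group variance.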
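It can be shown that in this situation, writing $n_j$ for the number of observations in the jth group, $\bar{y}_j$ for its mean response, and $\bar{y}$ for the overall mean, unbiased estimates of $u$ satisfy
$$\hat{u}_j = \frac{n_j/\sigma^2}{n_j/\sigma^2 + 1/\sigma_u^2}\,(\bar{y}_j - \bar{y}) \qquad (4)$$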
(Gelman and Hill, 2007, p. 253). Note that, in practice, estimates of the variance components are necessary to construct these estimates. Equation (4) shows that the random effects model provides a sensible compromise between the fixed effects strategy of treating each group separately (yielding $\bar{y}_j$ as the predicted y for all members of the jth group) and the null strategy of assuming that there are no group effects at all (yielding the overall mean $\bar{y}$ as the predicted y for all observations). Gelman and Hill (2007, Section 12.2) refer to this “borrowing strength” from other groups as partial pooling, reflecting a compromise between no pooling and complete pooling. The compromise is “sensible” because of the form of the weights in (4). In particular, groups with larger sample sizes are given more weight, and the smaller $\rho$ is (and hence the smaller the between-group variability is relative to the total variability), the more relative weight is given to complete pooling. As Robinson (1991) notes, these notions are related to the concept of regression to the mean, as they imply that the best estimate of a characteristic of an offspring is based on parental averages that are then regressed (shrunk) towards the population mean.
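A small numerical sketch may help make the weighting concrete. It assumes the variance components are known and uses the precision-weighted form of partial pooling described by Gelman and Hill (2007), in which the weight on a group mean is $(n_j/\sigma^2)/(n_j/\sigma^2 + 1/\sigma_u^2)$. The function name `partial_pool` and all numbers are hypothetical, chosen only for illustration.

```python
def partial_pool(group_mean, n_j, grand_mean, sigma2, sigma2_u):
    """Return (pooled prediction, weight on the group mean).

    Assumes known within-group variance sigma2 and between-group
    variance sigma2_u; in practice both would have to be estimated.
    """
    w = (n_j / sigma2) / (n_j / sigma2 + 1.0 / sigma2_u)
    return w * group_mean + (1.0 - w) * grand_mean, w


# Hypothetical blood pressure example: overall mean 120, group mean 130.
grand_mean, sigma2, sigma2_u = 120.0, 100.0, 25.0

small_pred, small_w = partial_pool(130.0, 2, grand_mean, sigma2, sigma2_u)
large_pred, large_w = partial_pool(130.0, 50, grand_mean, sigma2, sigma2_u)

# Same group size, but much less between-group variability (smaller rho).
low_rho_pred, low_rho_w = partial_pool(130.0, 2, grand_mean, sigma2, 1.0)

print(f"n_j = 2:  weight {small_w:.3f}, prediction {small_pred:.2f}")
print(f"n_j = 50: weight {large_w:.3f}, prediction {large_pred:.2f}")
print(f"n_j = 2, small sigma2_u: weight {low_rho_w:.3f}, prediction {low_rho_pred:.2f}")
```

The small group is pulled most of the way toward the overall mean, the large group stays close to its own mean, and shrinking the between-group variance moves the prediction further toward complete pooling.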
The weakness of model (3) is that it does not allow for the possibility of more complex fixed-effects structure or more complex random-effects structure. What is needed is a way to generalize (3) to allow for these more complex possibilities.
Another way the assumptions of the standard regression model (1) can be violated is if there is a time ordering to the observations, resulting in autocorrelation of the errors. That would, of course, require the application of time series methodology, of which there is a vast literature.
A more challenging situation is when data have both a cross-sectional and time series structure; that is, when individual units are repeatedly sampled at different points in time. Such data are called longitudinal data. They are a special case of repeated measures data, in which the repeated measurements for an individual unit are of the response at different time points (rather than, for example, under different experimental conditions). In economic data the term panel data is typically used for this type of situation.
Since the measurements at different time points are taken on the same individual, the issues of an induced correlation of observations within each individual noted earlier (measured by the ICC ρ) still apply. In addition to this, however, the possibility of autocorrelation of errors within an individual must also be considered. This suggests the use of multiple time series models, such as are described in Lütkepohl (2005), including vector autoregressive and moving average processes, cointegrated processes, multivariate ARCH and GARCH processes, among others.
The weakness of these models is that they are designed for the situation where there are a relatively small number of series, measured at a relatively large number of time points. Longitudinal data are typically the opposite situation—a relatively large number of individuals measured at a relatively small number of time points. What is needed are models that are designed for the latter situation. The models need to be flexible enough to handle when subject-level inference is of less interest and time effects are less important (as is often true in longitudinal data), when subjects might constitute the entire population (implying the use of fixed rather than random effects), and complex time structures reflecting, for example, the occurrence of seasonal and shock effects (as is often true for economic panel data).
All three of these scenarios (generalizing regression, generalizing random effects ANOVA, and generalizing time series models) lead directly to the use of multilevel models. The natural generalization of models (1), (2), and (3) is
$$y_{ij} = \beta_0 + \beta_1 x_{1ij} + \cdots + \beta_p x_{pij} + u_{0j} + \epsilon_{ij}, \qquad (5)$$
the random intercepts model (since $\beta_{0j} \equiv \beta_0 + u_{0j}$ defines different random intercepts for each group). This can be generalized further by allowing for different types of time-related structure in the joint distributions of $u$ or $\epsilon$ (or both). The random intercepts model (5) can also be generalized to allow for more complex random effects, such as random slopes and multiple levels of nesting.
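To see what a random intercept does to the data, the following sketch simulates a random intercepts model with a single predictor, that is, $y_{ij} = \beta_0 + \beta_1 x_{ij} + u_{0j} + \epsilon_{ij}$, and checks that the composite errors $u_{0j} + \epsilon_{ij}$ of two observations in the same group are correlated at roughly $\rho$. It is an illustration only: the function names, parameter values, and the use of the true coefficients (rather than estimates) are all assumptions made for the example.

```python
import math
import random


def simulate_random_intercepts(n_groups, n_per_group, beta0, beta1,
                               sigma_u, sigma, seed=0):
    """Simulate (group, x, y) triples from a one-predictor random intercepts model."""
    rng = random.Random(seed)
    data = []
    for j in range(n_groups):
        u0j = rng.gauss(0.0, sigma_u)          # one random intercept per group
        for _ in range(n_per_group):
            x = rng.uniform(0.0, 1.0)
            y = beta0 + beta1 * x + u0j + rng.gauss(0.0, sigma)
            data.append((j, x, y))
    return data


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def within_group_error_correlation(data, beta0, beta1):
    """Correlate the composite errors of the first two observations in each group."""
    errors = {}
    for j, x, y in data:
        errors.setdefault(j, []).append(y - beta0 - beta1 * x)
    first = [e[0] for e in errors.values()]
    second = [e[1] for e in errors.values()]
    return pearson(first, second)


# With sigma_u = sigma = 1 the intraclass correlation is 1 / (1 + 1) = 0.5.
data = simulate_random_intercepts(2000, 2, beta0=1.0, beta1=2.0,
                                  sigma_u=1.0, sigma=1.0, seed=1)
rho_hat = within_group_error_correlation(data, beta0=1.0, beta1=2.0)
print(f"empirical within-group error correlation: {rho_hat:.3f}")
```

An ordinary regression fit to such data would treat these correlated errors as independent, which is the violation of assumptions that motivates the multilevel formulation.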
Since multilevel models generalize regression, ANOVA, and time series models, lessons from each of those classes of models still apply. In particular, preliminary graphical examination of data, diagnostic measures and tests of assumptions, transformation of variables, and so on, all have their role in multilevel data analysis. Each of these issues will be discussed in future chapters.

The Development of Multilevel Modeling

Many of the succeeding chapters discuss the historical development of different types of multilevel model, so we will only briefly provide some context here. More detailed discussion of this can also be found in de Leeuw and Kreft (1986), Engel (1990), and Robinson (1991). As the discussion in the previous section would suggest, the different ways that multilevel models generalize more basic models correspond to their adoption at different times in different subject areas.
The earliest examples come from the use of random effects in agricultural experiments. The original fundamental work is due to R.A. Fisher, described in Chapter 7 of the 1925 edition of his Statistical Methods for Research Workers (Fisher, 1925). Eisenhart (1947) built on this work by explicitly considering the distinction between fixed and random effects. The best linear unbiased predictor (shrinkage) technique was introduced by Henderson (1950) in the context of animal breeding. Lindley and Smith (1972) demonstrated the close connection of this technique to Bayes and Empirical Bayes methods, and Efron and Morris (1975) discussed the connection to James–Stein estimation.
Models with random parameters have long been a part of the econometric treatment of panel data. Early examples include Hildreth and Houck (1968), Swamy (1970, 1971), Hsiao (1974, 1975), Cooley and Prescott (1976), and Amemiya (1978). MaCurdy (1981) explicitly discussed the connection between multiple time series models and the analysis of panel data.
The recognition that the use of regression models that ignore hierarchical structures can lead to incorrect inferences in Aitkin et al. (1981) had a profound effect on educational research in the 1980s. The naturally hierarchical structure of educational data noted in the previous section resulted in a good deal of research on the application of the hierarchical linear model to educational data, including Burstein (1980), Goldstein (1985, 1987), Aitkin and Longford (1986), de Leeuw and Kreft (1986), Raudenbush and Bryk (1986), Raudenbush (1988), and Bryk and Raudenbush (1992).
Laird and Ware (1982) extended earlier work on growth curves (Rao, 1965; Fearn, 1975) and repeated measures (Harville, 1977) to linear random effects models for longitudinal data. This work has provided the basis for many extensions, including to binary data (Stiratelli et al., 1984), nonlinear models (Lindstrom and Bates, 1990), missing and censored data (Brown, 1990), nonparametric regression (Rice and Wu, 2001), and regression trees (Hajjem et al., 2011; Sela and Simonoff, 2012).
Advances in computing in general, and computer software packages in particular, have had a very strong effect on the development of multilevel modeling. The early ANOVA-based algorithms required balanced designs; adaptations to unbalanced data lose the benefit of exact distributional results and can lead to negative variance estimates (Searle, 1971, 1987). Harville (1977) discussed maximum likelihood and restricted maximum likelihood estimation. Laird and Ware (1982) described the use of the EM (expectation–maximization) algorithm of Dempster et al. (1977) to calculate these estimates, and the HLM software package introduced in 1988 (Bryk et al., 1988) was based on this approach. The uses of iterative generalized least squares (Goldstein, 1986) and Newton–Raphson and Fisher scoring algorithms (Jennrich and Schluchter, 1986) to conduct likelihood-based inference also date from this time, with the former method being the basis of ML2 (Rasbash et al., 1989) and its successors such as MLwiN (Rasbash et al., 2004), and the latter method being the basis of BMDP 5V (Schluchter, 1988). The availability of these packages greatly expanded the applicability of multilevel models in the late 1980s.
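The possibility of a negative variance estimate is easy to see in the simplest balanced case. The sketch below uses the standard textbook method-of-moments (ANOVA) estimator for a balanced one-way random effects layout, in which the between-group variance is estimated as (MSB − MSW)/n. It is meant as an illustration of the phenomenon, not as a reproduction of the algorithms discussed by Searle; the function name and toy data are hypothetical.

```python
def anova_variance_components(groups):
    """Method-of-moments estimates for a balanced one-way random effects layout.

    groups: list of equally sized lists of responses, one list per group.
    Returns (sigma2_hat, sigma2_u_hat); the second value can be negative.
    """
    n_groups = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch requires a balanced design")
    group_means = [sum(g) / n for g in groups]
    grand_mean = sum(group_means) / n_groups
    # Within-group mean square: estimates sigma^2.
    msw = sum((y - m) ** 2
              for g, m in zip(groups, group_means) for y in g) / (n_groups * (n - 1))
    # Between-group mean square: expectation is sigma^2 + n * sigma_u^2.
    msb = n * sum((m - grand_mean) ** 2 for m in group_means) / (n_groups - 1)
    return msw, (msb - msw) / n


# Well-separated groups: the between-group estimate is clearly positive.
separated = [[1.0, 2.0], [5.0, 6.0], [9.0, 10.0]]
# Identical group means but noisy observations: the estimate goes negative.
overlapping = [[1.0, 5.0], [2.0, 4.0], [3.0, 3.0]]

sep_sigma2, sep_sigma2_u = anova_variance_components(separated)
ovl_sigma2, ovl_sigma2_u = anova_variance_components(overlapping)
print(sep_sigma2, sep_sigma2_u)
print(ovl_sigma2, ovl_sigma2_u)
```

Whenever the group means vary less than the within-group noise alone would predict, the subtraction produces a negative value, which is one reason likelihood-based methods that respect the constraint of nonnegative variances became attractive.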
In the 1990s and 2000s these packages (which were specifically designed for multilevel models, other than BMDP 5V) were joined by general-purpose packages with modules or functions designed for multilevel modeling, including SAS, R, SPSS, and Stata. These functions were eventually expanded to include coverage of nonlinear mixed models and generalized linear mixed models for non-Gaussian data. Chapter 26 discusses these and other packages in more detail.
The computational revolution in Bayesian analysis brought about by Markov Chain Monte Carlo methods has also had a strong effect on multilevel modeling, which is unsurprising given the natural Bayesian interpretation of random effect distributions as prior distributions. Several packages offer these capabilities, including MLwiN and R.

The Design and Goals of this Book

The chapters are divided into four parts, the first three of which involve aspects of model development and the last of which emphasizes applications. The subject of model development is viewed in a very general way, including, in addition to methodology, aspects of the history of multilevel modeling, design and data collection, and software. Some chapters highlight innovative cutting-edge methodology while also acknowledging and addressing current controversies in the field. To the extent possible, the chapters utilize a common set of notation, and a chapter is dedicated to establishing the connections and equivalence of the most common notations used in practice.1 The applications chapters highlight both generality and generalizability, thereby encouraging different communities of researchers to learn from the research programs of others; that is, the centerpiece of each of the chapters is the underlying methodology.
Part I of the Handbook, Multilevel Model Specification and Inference, is a self-contained overview of multilevel modeling, which includes topics of key interest to practitioners, such as design considerations and causal inference, that are sometimes available only through more specialized outlets. The overall framework for multilevel modeling is provided in Chapter 1, including a concise introduction to different types of multilevel model and their associated terminology. This is followed by a chapter on notation, which establishes the rules and rationale guiding notation one finds in practice, and discusses different contexts that would recommend one notational form over another. Nearly all chapters utilize a form discussed in this chapter, and when they deviate, specific rationale is provided.2 Frequentist and Bayesian approaches to estimation and inference are developed next in Chapters 3 and 4, respectively.
Chapters 5 and 6 are two perspectives on what at first appear to be unrelated issues: placing distributional assumptions on group effects and centering predictors. The former emphasizes the modeling assumptions and their implications for interpretation, wherein it is revealed that between-group relationships and their within-group analog are implicitly tied to these assumptions through estimation choices. The chapter ends with a discussion of centering that potentially unifies what has often been viewed as a contentious topic. This chapter is followed by a full chapter on centering, which exposes the underlying complexity of the choice with an emphasis on the implicit research questions that result. The overarching theme is whether effects can be characterized as between- or within-subjects, tying the discussion to the issues presented in the prior chapter. Chapter 7 discusses how the standard approaches to model selection and measures of fit require significant modification in the realm of multilevel modeling, particularly in the area of accounting for degrees of freedom. Examples provide further insight as to what may be appropriate in different multilevel contexts (e.g., nested versus panel data).
At this point in the presentation, the reader has most of the tools needed to model Gaussian outcomes, so Chapter 8 provides an overview of the extension of generalized linear models to include multilevel structure. Following this is a chapter on longitudinal data models. It provides a thorough introduction to these models, including the somewhat unique considerations of time-series modeling and loss to follow-up. It also includes a discussion of generalized linear mixed models and estimation using generalized estimating equations.
Returning to the theme of re-examining standard approaches and assumptions, Chapter 10 discusses a variety of error structures that fill the spectrum between simpler compound symmetry (or random intercept) models and fully unstructured covariance. The first section ends with a chapter on design considerations and one on causal inference. Optimal design is especially important in grouped or multilevel data, where the group sizes may be somewhat small, and this chapter covers important designs such as the cluster randomized trial. As many research questions involve binary outcomes, this situation is included in the discussion. Chapter 12 tackles the challenge of causal inference and is a thorough overview of the additional considerations one must confront with observational as well as randomized studies in the multilevel setting. This chapter builds on material from Chapter 5.
The next part of the Handbook develops methodology for handling variations and extensions of multilevel models. While some of the chapters may appear to be quite specialized, many practitioners are trying to draw inference from increasingly complex data structures and the relationships they represent. In other words, practitioners increasingly find themselves facing new types of multilevel data and their research questions require new methodology. Chapter 13 on multilevel functional data analysis (MFDA) is an excellent example of the “outcome” variable being more complex than a continuous scalar measurement. In MFDA the outcome is a function, perhaps indexed by time or space. In a similar vein, Chapter 21 extends multilevel models to accommodate multiple measures on the same subject when they are nested, by group or time. The chapter notably does not limit the outcomes to be Gaussian. Just as outcomes can be more complex, the models themselves may be; the natural extension of linear models to nonlinear models can also incorporate random effects and thus multilevel structure, and this is the topic of Chapter 14. Perhaps the more commonly known nonlinear multilevel models, generalized linear mixed models (GLMMs), are discussed next; these were introduced in Part I of the Handbook, but now estimation methods are given in greater detail. In most discussions of GLMMs, binary and count outcomes are developed quite thoroughly, while categorical outcomes with more than two levels are not. Yet multilevel categorical data are quite common among practitioners, so we dedicate the subsequent chapter to this topic exclusively.
Chapters 17–20 extend features of multilevel models that some practitioners take for granted. For example, while linear models can capture a variety of mean structures, nonparametric approaches allow one to more closely capture an underlying relationship. Semiparametric models for the mean may be more realistic and may improve the modeling of covariance as well. Chapter 17 develops a framework for nonparametric extensions of mean and covariance structures. Chapter 18 develops related semiparametric models for the mean using a roughness penalty approach that is a multilevel analog of generalized additive models (Hastie and Tibshirani, 1990). Chapter 19 returns to the linear model, but now allows the predictor effects to vary with time, all within the multilevel framework. Chapter 20 extends the class of random effects distributions available to multilevel modelers using a latent class approach. The simplest way to characterize this extension is, “what if you wished to draw your random effects from a mixture of Gaussian densities?” This chapter merges ideas from latent growth curve modeling (Jones and Nagin, 2007) and model-based clustering (Banfield and Raftery, 1993) to form a coherent framework for more realistic longitudinal data modeling.
Having established both the general framework and extensions of multilevel models, Part III of the Handbook covers model fitting and specification issues. For example, Chapter 22 begins with a general discussion of robustness and then covers essential topics such as robust standard error estimates, bootstrapping, and using Bayesian approaches (e.g., replacing normal distributions with t- or Cauchy distributions). All of these ideas are developed with the particular features of multilevel modeling in mind. Missing data are quite common in applied statistical analyses, and Chapter 23 provides a comprehensive review of approaches to dealing with missingness. After defining the types of missingness that may occur, the authors discuss approaches ranging from historically common ones, such as last observation carried forward, to multiple imputation and pattern mixture models. Next, Chapter 24 offers a detailed discussion of the assumptions in multilevel models and how they can be assessed. The chapter emphasizes important graphical techniques and clearly describes the limitations inherent in some approaches. Chapter 25 offers a unique perspective on a substantive concern (endogeneity) and an estimation technique known as Generalized Estimating Equations (GEE). The perspective offered, that GEE is potentially robust with respect to this issue, is examined closely via a simulation study. The chapter has ties to many chapters in the book, including those emphasizing generalized linear models, the choice between fixed and random effects, and causality (Chapters 15, 5, and 12, respectively).
We close this part of the Handbook with a practical discussion of software packages in Chapter 26. In particular, the authors emphasize the connection between estimation methods and software to implement them, as well as specialty software for more recently introduced multilevel models. For example, there is discussion of generalized additive mixed models, models for dyads or social networks, and implementation of complex survey weights, to name a few.
Part IV of the Handbook contains selected applications that illustrate domains of inquiry making novel use of multilevel models. For example, Chapter 27 describes how meta-analysis clearly benefits from multilevel modeling approaches; what is less apparent is that the “level one units” consist of study findings and the within-study uncertainty, but not the within-study subject responses. In Chapter 28, the author presents an overview of the ways in which policy analysts utilize multilevel models, and points out several considerations that are both common in that field and commonly overlooked. Examples include micro-numerosity and spatial correlation. Coverage of these topics is broad enough that this chapter serves as a useful springboard for delving more deeply in chapters covering fixed versus random effects, causality, and spatial modeling (Chapters 5, 12 and 31, respectively). Chapter 29 offers a broader overview than its predecessor, providing insight into the opportunities and pitfalls that accompany multilevel modeling in the social and behavioral sciences. This chapter might be read immediately after the first few chapters in the Handbook.
The last four chapters are similar in spirit to Part II of the Handbook in that they involve extending an established methodology to accommodate multilevel data and structures. Chapter 30 discusses the frailty model in survival analysis, perhaps one of the earliest extensions of an existing class of models to handle a form of nesting; data of this nature are quite common in the medical sciences, but the method has broad applicability. In some problems, the unit of analysis is a location in space (e.g., weather). When multiple measures are collected at specific points, we have grouped data, but observations across groups are correlated due to spatial proximity. Multilevel models for such data must thus include models for several forms of covariance. Chapter 31 discusses these, complementing the general treatment of multivariate outcomes given in Chapter 21. An independent area of research of its own, structural equation models are particularly useful for capturing relationships between latent variables, making them popular in the behavioral sciences. In Chapter 32, our market research application, the author develops the multilevel structural equation model and applies it to preference data. Similar to spatial data models, network and relational data models represent complex correlation structure, but they must represent relationships on graphs, which is distinct from nearly every other topic discussed in this book (one way to see this is to consider network data to be a realization of an adjacency matrix of relations). Chapter 33 applies multilevel models to friendship networks of students nested in classrooms. These last four chapters are excellent examples of how the addition of multilevel structures improves our understanding of complex human behaviors.

What isn't in this Book

Although we have made a genuine effort to leave “no stone unturned,” there remain some multilevel modeling topics that are either not fully covered or completely absent from this handbook, and this section surveys some of these missing topics. Broadly speaking, these coverage gaps can be delineated into two camps: ones due to space limitations and areas of future research.
We start with “big data,” which has become a gigantic issue in nearly all areas of statistics, and multilevel modeling is no exception. With such data, computational issues clearly become increasingly important, especially in combination with complex models. As a case in point, Strandén and Lidauer (1999) have addressed real time and applied challenges in solving large mixed models on continuous evaluation of dairy cattle with random regression, which requires both a fast algorithm and solution method. These authors cleverly used conjugate gradient iterative methods, efficiently calculating the multiplication of a vector with a matrix in (three) reordered steps. Generalized linear mixed models (GLMMs) require additional attention to computation speed. Ver Hoef and London (2010) considered pseudo-GLMMs, with many binary repeated measures on each subject, that have computational time that increases linearly with the number of observations. These authors impressively tackled computational issues for such binary time series, containing over 100 fixed effects, 50 random effects parameters, and 1.5 × 10⁵ observations. In other related work, Alam (2010) developed a two-step pseudo-likelihood estimation technique for the GLMM that is more computationally efficient than conventional algorithms. This author specifically addressed models having random effects that are (weakly) correlated between groups, while presenting a binary response application to Swedish credit default data. More recently, Schelldorfer and Bühlmann (2011) developed a lasso-type approach useful for variable screening in high-dimensional GLMMs. The approach is based on an ℓ1-penalized algorithm, and is implemented in the R package glmmlasso. Also related to rich data are problems that contain intensive longitudinal data. A portion of Chapter 13 is dedicated to such intensive data, where in some cases outcomes take the form of a curve for each subject.
We find commonality in all of the above research efforts: efficient computation in rich multilevel modeling settings.
Another approach useful for handling “big data,” especially when (generalized) linear mixed models struggle to track the data structure, is the use of regression trees. Tree-based models are based on a recursive partitioning of the data and are popular for several reasons: they are more adept at (automatically) capturing nonadditive or interactive structure, they can easily handle a mix of numeric covariates and factors, and they are invariant to (monotone) re-expressions of the predictors. Extensions have been put forth into hierarchical settings, using the EM algorithm. Hajjem et al. (2011) developed a mixed effects regression tree for nested data within clusters. Sela and Simonoff (2012) developed methodology that combines the structure of mixed effects models for both longitudinal and clustered data, with the flexibility of tree-based estimation methods. The strength of using a tree-based approach becomes apparent in this setting as this mixed effect approach, for example, can handle unbalanced clusters, allows clusters to be split, and can incorporate random effects and also observation-level covariates.
More in the spirit of “enormous data,” Chapter 33 of this handbook develops multilevel models for social networks. It is natural to wonder how such statistical models can be used and extended with massive network data, such as that from social media platforms like Facebook and Twitter. Although the models described in Chapter 33 have demonstrated effectiveness in describing and predicting smaller networks, their complexity will grow exponentially for networks with millions of individuals and potentially hundreds of millions of ties between those individuals.
Another area that is not extensively covered here is marginal structural models, which are an alternative to structural nested models, and thus relevant to our handbook. Using observational data, Robins et al. (2000) developed a new class of causal models for the (consistent) estimation of the causal effect of, e.g., a time-dependent exposure in the presence of time-dependent covariates, which may additionally be simultaneously confounders and intermediate variables. Chapters 5 and 12 of this handbook do contain some discussion of marginal structural models, but not in any great detail.
We do not cover problems related to small area estimation (SAE) in multilevel models in this volume. Essentially, the SAE problem is to estimate unknown parameters (e.g., the mean or quantiles) for areas in which only small (or no) samples are available. Precision of estimation is also of interest in this context. Pfeffermann (2012) has compiled a contemporary review containing important developments in SAE during the last decade, and he considers both design-based and model-dependent approaches (the latter with both frequentist and Bayesian methods). A selection of papers that are specifically related to small area estimation in the multilevel models setting includes Pfeffermann et al. (1998), Feder et al. (2000), and Pfeffermann et al. (2008). An issue related to SAE is that of weighting for complex survey design. Rabe-Hesketh and Skrondal (2006) nicely presented the topic of multilevel modeling of complex survey data, and the references therein contain, more generally, a broad coverage of the role of sampling weights.
Lastly, coverage of measurement error (quantitative variables) and classification error (categorical variables) is absent from this handbook. Fox and Glas (2002) devoted an entire chapter to the modeling of measurement error in structural multilevel models, which is a nice overview and also provides many references. Somewhat more recently, Goldstein et al. (2008) developed models that adjusted for measurement errors in normally distributed predictors and response variables and categorical predictors with misclassification errors. Using Markov Chain Monte Carlo, the models allowed for hierarchical data structure and for correlations among the errors and misclassifications. See also Chapter 14 of Goldstein (2011).

Notes

1 Some chapters are more technically sophisticated than others, sometimes relying heavily on matrix notation. The reader should consult Chapter 2 which, in addition to introducing that notation, provides some references for deeper understanding and further examples.
2 A good example of such deviation is Chapter 12 on causal inference, which utilizes the potential outcomes notation discussed in Rubin (1978), modified to incorporate a multilevel model framework.

References

Aitkin, Anderson, and Hinde (1981) ‘Statistical Modeling of Data on Teaching Styles’, Journal of the Royal Statistical Society, Series A, 144: 419–61. http://dx.doi.org/10.2307/2981826
Aitkin and Longford (1986) ‘Statistical Modeling Issues in School Effectiveness Studies’, Journal of the Royal Statistical Society, Series A, 149: 1–43. http://dx.doi.org/10.2307/2981882
Alam (2010) ‘Feasible Estimation of Generalized Linear Mixed Models (GLMM) with Weak Dependency Between Groups’, http://oru.diva-portal.org/smash/get/diva2:389313/FULLTEXT01 (accessed September 14, 2012).
Amemiya (1978) ‘A Note on a Random Coefficients Model’, International Economic Review, 19: 793–6. http://dx.doi.org/10.2307/2526342
Banfield and Raftery (1993) ‘Model-based Gaussian and Non-Gaussian Clustering’, Biometrics, 49: 803–21. http://dx.doi.org/10.2307/2532201
Brown (1990) ‘Protecting Against Nonrandomly Missing Data in Longitudinal Studies’, Biometrics, 46: 143–55. http://dx.doi.org/10.2307/2531637
Bryk and Raudenbush (1992) Hierarchical Linear Models. Newbury Park, CA: Sage.
Bryk, Raudenbush, Seltzer, and Congdon (1988) An Introduction to HLM: Computer Program and Users Manual. Chicago: University of Chicago Dept. of Education.
Burstein (1980) ‘The Analysis of Multi-Level Data in Educational Research and Evaluation’, Review of Research in Education, 8: 158–233.
Chatterjee and Simonoff (2013) Handbook of Regression Analysis. Hoboken, NJ: John Wiley and Sons.
Cooley and Prescott (1976) ‘Estimation in the Presence of Stochastic Parameter Variation’, Econometrica, 44: 167–84. http://dx.doi.org/10.2307/1911389
de Leeuw and Kreft (1986) ‘Random Coefficient Models for Multilevel Analysis’, Journal of Educational Statistics, 11: 57–85. http://dx.doi.org/10.2307/1164848
Dempster, Laird, and Rubin (1977) ‘Maximum Likelihood with Incomplete Data via the E-M Algorithm’, Journal of the Royal Statistical Society, Series B, 39: 1–38.
Efron and Morris (1975) ‘Data Analysis Using Stein's Estimator and Its Generalizations’, Journal of the American Statistical Association, 70: 311–9. http://dx.doi.org/10.1080/01621459.1975.10479864
Eisenhart (1947) ‘The Assumptions Underlying the Analysis of Variance’, Biometrics, 3: 1–21. http://dx.doi.org/10.2307/3001534
Engel (1990) ‘The Analysis of Unbalanced Linear Models With Variance Components’, Statistica Neerlandica, 44: 195–219. http://dx.doi.org/10.1111/j.1467-9574.1990.tb01282.x
Fearn (1975) ‘A Bayesian Approach to Growth Curves’, Biometrika, 62: 89–100. http://dx.doi.org/10.1093/biomet/62.1.89
Feder, Nathan, and Pfeffermann (2000) ‘Multilevel Modeling of Complex Survey Longitudinal Data with Time Varying Random Effects’, Survey Methodology, 26: 53–65.
Fisher (1925) Statistical Methods for Research Workers. Edinburgh: Oliver and Boyd.
Fox and Glas (2002) ‘Modeling Measurement Error in a Structural Multilevel Model’, in Marcoulides, G.A. and Moustaki, I. (eds), Latent Variable and Latent Structure Models, London: Lawrence Erlbaum Associates, 245–69.
Gelman and Hill (2007) Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge: Cambridge University Press.
Goldstein (1985) Multilevel Statistical Models. New York: Halstead Press.
Goldstein (1986) ‘Multilevel Mixed Model Analysis Using Iterative Generalized Least Squares’, Biometrika, 73: 43–56. http://dx.doi.org/10.1093/biomet/73.1.43
Goldstein (1987) Multilevel Models in Education and Social Research. Oxford: Oxford University Press.
Goldstein (2011) Multilevel Statistical Models (4th ed.). Chichester: John Wiley and Sons.
Goldstein, Kounali, and Robinson (2008) ‘Modeling Measurement Errors and Category Misclassifications in Multilevel Models’, Statistical Modeling, 8: 243–61. http://dx.doi.org/10.1177/1471082X0800800302
Hajjem, Bellavance, and Larocque (2011) ‘Mixed Effects Regression Trees for Clustered Data’, Statistics and Probability Letters, 81: 451–9. http://dx.doi.org/10.1016/j.spl.2010.12.003
Harville (1977) ‘Maximum Likelihood Approaches to Variance Component Estimation and to Related Problems’, Journal of the American Statistical Association, 72: 320–40. http://dx.doi.org/10.1080/01621459.1977.10480998
Hastie and Tibshirani (1990) Generalized Additive Models. Boca Raton, FL: Chapman & Hall/CRC.
Henderson (1950) ‘Estimation of Genetic Parameters (abstract)’, Annals of Mathematical Statistics, 21: 309–10.
Hildreth and Houck (1968) ‘Some Estimators for a Linear Model with Random Coefficients’, Journal of the American Statistical Association, 63: 584–95. http://dx.doi.org/10.2307/2284029
Hsiao (1974) ‘Statistical Inference for a Model With Both Random Cross-Sectional and Time Effects’, International Economic Review, 15: 12–30. http://dx.doi.org/10.2307/2526085
Hsiao (1975) ‘Some Estimation Methods for a Random Coefficients Model’, Econometrica, 43: 305–25. http://dx.doi.org/10.2307/1913588
Jennrich and Schluchter (1986) ‘Unbalanced Repeated-Measures Models With Structured Covariance Matrices’, Biometrics, 42: 805–20. http://dx.doi.org/10.2307/2530695
Jones and Nagin (2007) ‘Advances in Group-Based Trajectory Modeling and an SAS Procedure for Estimating Them’, Sociological Methods Research, 35: 542–71. http://dx.doi.org/10.1177/0049124106292364
Laird and Ware (1982) ‘Random Effects Models for Longitudinal Data’, Biometrics, 38: 963–74. http://dx.doi.org/10.2307/2529876
Lindley and Smith (1972) ‘Bayes Estimates for the Linear Model’, Journal of the Royal Statistical Society, Series B, 34: 1–41.
Lindstrom and Bates (1990) ‘Nonlinear Mixed Effects Models for Repeated Measures Data’, Biometrics, 46: 673–87. http://dx.doi.org/10.2307/2532087
Lütkepohl (2005) New Introduction to Multiple Time Series Analysis. Berlin: Springer.
MaCurdy (1981) ‘Multiple Time-Series Models Applied to Panel Data’, NBER Working Paper No. 646 (http://www.nber.org/papers/w0646, accessed August 27, 2012).
Pfeffermann, Skinner, Holmes, Goldstein, and Rasbash (1998) ‘Weighting for Unequal Selection Probabilities in Multilevel Models (with discussion)’, Journal of the Royal Statistical Society, Series B, 60: 23–56. http://dx.doi.org/10.1111/1467-9868.00106
Pfeffermann, Terryn, and Moura (2008) ‘Small Area Estimation Under a Two-Part Random Effects Model with Application to Estimation of Literacy in Developing Countries’, Survey Methodology, 34: 233–47.
Pfeffermann (2012) ‘New Important Developments in Small Area Estimation’, Statistical Science, 27: to appear.
Rabe-Hesketh and Skrondal (2006) ‘Multilevel Modeling of Complex Survey Data’, Journal of the Royal Statistical Society, Series A, 169: 805–27.
http://dx.doi.org/10.1111/j.1467-985X.2006.00426.xand  ‘The Theory of Least Squares When the Parameters Are Stochastic and its Application to the Analysis of Growth Curves’, Biometrika, 52: 447–58. ML2: Software for Two-Level Analysis. London: Institute of Education, University of London., , and  A User's Guide to MLwiN, Version 2.0. Bristol, UK: Centre for Multilevel Modeling., , , and  ‘Education Applications of Hierarchical Linear Models: A Review’, Journal of Educational Statistics, 12: 85–116. http://dx.doi.org/10.2307/1164748 ‘A Hierarchical Model for Studying School Effects’, Sociology of Education, 59: 1–17. http://dx.doi.org/10.2307/2112482and  ‘Nonparametric Mixed Effects Models for Unequally Sampled Noisy Curves’, Biometrics, 57: 253–9. http://dx.doi.org/10.1111/j.0006-341X.2001.00253.xand  ‘Marginal Structural Models and Causal Inference in Epidemiology’, Epidemiology, 11: 550–60. http://dx.doi.org/10.1097/00001648-200009000-00011, , and  ‘That BLUP Is a Good Thing: Estimation of Random Effects’, Statistical Science, 6: 15–51. http://dx.doi.org/10.1214/ss/1177011926 ‘Bayesian Inference for Causal Effects: The Role of Randomization’, The Annals of Statistics6: 34–58. http://dx.doi.org/10.1214/aos/1176344064 ‘GLMM-Lasso: An Algorithm for High-Dimensional Generalized Linear Mixed Models Using ℓ1-Penalization’, [http://arxiv.org/abs/1109.4003, accessed September 6, 2012].and  BMDP 5V: Unbalanced Repeated Measures: Models with Structured Covariance Matrices. BMDP Statistical Software. Linear Models. New York: John Wiley and Sons. Linear Models for Unbalanced Data. New York: John Wiley and Sons. ‘RE-EM Trees: A Data Mining Approach for Longitudinal and Clustered Data’, Machine Learning, 86: 169–207. http://dx.doi.org/10.1007/s10994-011-5258-3and  ‘Random-Effects Models for Serial Observations With Binary Response’, Biometrics, 40: 961–71. 
http://dx.doi.org/10.2307/2531147, , and  ‘Solving Large Mixed Linear Models Using Preconditional Conjugate Gradient Iteration’, Journal of Dairy Science, 82: 2779–87.and  ‘Efficient Inference in a Random Coefficient Regression Model’, Econometrica, 38: 311–23. http://dx.doi.org/10.2307/1913012 Statistical Inference in Random Coefficients Regression Models. New York: Springer. http://dx.doi.org/10.1007/978-3-642-80653-7 ‘Fast Computing of Some Generalized Linear Mixed Pseudo-Models with Temporal Autocorrelation’, Computational Statistics, 25: 39–55. http://dx.doi.org/10.1007/s00180-009-0160-1and [[Page xxxvi]