‘The editors of the new SAGE Handbook of Regression Analysis and Causal Inference have assembled a wide-ranging, high-quality, and timely collection of articles on topics of central importance to quantitative social research, many written by leaders in the field. Everyone engaged in statistical analysis of social-science data will find something of interest in this book.’

- John Fox, Professor, Department of Sociology, McMaster University

‘The authors do a great job in explaining the various statistical methods in a clear and simple way - focussing on fundamental understanding, interpretation of results, and practical application - yet being precise in their exposition.’

- Ben Jann, Executive Director, Institute of Sociology, University of Bern

‘Best and Wolf have put together a powerful collection, especially valuable in its separate discussions of uses for both cross-sectional and panel data analysis.’

- Tom Smith, Senior Fellow, NORC, University of Chicago

Edited and written by a team of leading international social scientists, this Handbook provides a comprehensive introduction to multivariate methods. The Handbook focuses on regression analysis of cross-sectional and longitudinal data with an emphasis on causal analysis, thereby covering a large number of different techniques including selection models, complex samples, and regression discontinuities.

Each Part starts with a non-mathematical introduction to the method covered in that section, giving readers a basic knowledge of the method's logic, scope and unique features. Next, the mathematical and statistical basis of each method is presented along with advanced aspects. Using real-world data from the European Social Survey (ESS) and the Socio-Economic Panel (GSOEP), the book provides a comprehensive discussion of each method's application, making this an ideal text for PhD students and researchers embarking on their own data analysis.

# Chapter 5: Regression Analysis: Assumptions and Diagnostics


### Introduction

As shown in the previous chapter, ordinary least squares (OLS) regression links the values of a dependent variable $Y_i$ ($i = 1, 2, \ldots, n$) to the values of a set of independent variables $X_{ik}$ by means of a linear function and an error term $\varepsilon_i$:

$$Y_i = \sum_{k=0}^{p-1} \beta_k X_{ik} + \varepsilon_i,$$

where $k$ ranges from 0 to $p-1$. This model thus contains $p$ regression parameters (namely the effects of $p-1$ predictors and one intercept: $X_{i0}$ equals 1 for all cases). The linear function is called the linear predictor or the structural part of the model, while the error term is the random or stochastic component of the model. In general, regression analysis can be used for two purposes: (1) to describe the ...
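To make the structure of the model concrete, the following is a minimal sketch in Python with NumPy, not code from the Handbook. It simulates data from a linear model with $p = 3$ parameters and recovers them by least squares. The coefficient symbol and its values (`beta_true`), the sample size, the error scale, and the random seed are all assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

n, p = 200, 3                            # n cases, p parameters (intercept + p-1 predictors)
beta_true = np.array([1.0, 2.0, -0.5])   # hypothetical coefficients, for illustration only

# Design matrix: column 0 is the constant (equal to 1 for all cases),
# columns 1..p-1 hold the predictors.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])

linear_predictor = X @ beta_true         # structural part of the model
eps = rng.normal(scale=1.0, size=n)      # random (stochastic) component
y = linear_predictor + eps

# OLS estimates: the coefficients that minimise the sum of squared residuals.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat

print("estimated coefficients:", beta_hat)
```

Because the design matrix contains a constant column, the fitted residuals sum to zero (up to floating-point error) and are orthogonal to every predictor, which is a useful sanity check before running any further diagnostics.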
