# A Conceptual Guide to Statistics Using SPSS


### Elliot T. Berkman & Steven P. Reise

## Dedication

EB—I dedicate this book to my parents for encouraging my childhood nerdlihood, and to my wife for continuing to find it endearing.

## Preface

This book grew out of our experiences across many years of teaching introductory statistics to graduate students and advanced undergraduates in psychology. We noticed that our students faced a special set of challenges in learning statistics compared to other topics covered in the psychology curriculum. It was often the case that our students had little or no background in statistics and were consequently unfamiliar with thinking about the world in statistical or probabilistic terms. Even when they were familiar with statistics, our students often just didn't like it. And to make matters worse, in addition to their usual course load, they were also busy meeting heavy research expectations (for graduate and honors students) or assisting faculty with their research. The dilemma for these students became how to learn the challenging theoretical material taught in statistics class while also coming away with the practical computational skills needed to advance their research.

The current text proposes to aid students by drawing clear connections between the theoretical and computational aspects of statistics, emphasizing the importance of understanding theoretical concepts during computation, and demonstrating how and where they fit into SPSS, an IBM Company*. The text not only demonstrates how to use SPSS for advanced computation but also aids students' understanding of the theoretical concepts by teaching them in another, more practical context.

Our goal in this book is to clearly map the theories and techniques taught in a statistics class to the procedures in SPSS. The text teaches students how to perform standard and advanced statistical tests using both the point-and-click menus and syntax functions and how to integrate the SPSS functions with the statistical theory taught in class. The theoretical foundations underlying each topic are introduced before the computational steps in order to remind students of the logic of each statistical test. In this way, a conceptual link is created between the statistical test and the computational steps, and attention is drawn to test-specific issues. Presenting the material in this way also helps to give students a better understanding of the test output because they know which parameters were used “behind the scenes” in the computation. To better fit the material to the needs of a graduate-level audience, advanced options and variations on each test are discussed, and the syntax commands are presented. This gives students more flexibility in tailoring their analyses to a wide variety of experimental paradigms.

It was impossible to cover all of the many statistical tests offered in the SPSS package. Instead, our goal was to provide coverage on any statistical test that might appear in a peer-reviewed psychology journal article. The text features detailed chapters on the tests most commonly used by psychologists, as well as several newer tests that are increasing in popularity. Each chapter is structured similarly so students familiar with the text will be able to quickly flip open the book to learn a new topic.

The book is organized in parallel to many standard statistics textbooks covering correlation, t-tests, ANOVA and MANOVA, multiple regression, and nonparametric tests. Each chapter begins with a brief conceptual introduction featuring test assumptions and a sketch of the mathematical operations behind a procedure. This is followed by an illustrated and annotated step-by-step guide to computation with references back to the introduction where possible and concludes with a discussion of the output.

Target Audience

This book is intended for anyone who not only wants to know how to use SPSS to compute a variety of statistical tests, but who also wants to understand the reasons behind each step and the conceptual meaning of the output. This includes advanced undergraduates in the social sciences; master's students in psychology, education, economics, public health, biological sciences, and counseling psychology; PhD students in the social sciences; and faculty in all these fields seeking a deeper understanding of SPSS than that offered by the usual step-by-step procedural guides. Most of the examples are drawn from research in social and personality psychology, but the tests used are common across many fields that make use of empirical behavioral data.

This text is sufficiently detailed to serve as a stand-alone guide to SPSS, but also is intended to complement a statistics textbook for a variety of undergraduate and graduate statistics courses in the social sciences. Because we cover topics ranging from t-tests and regression to factor analysis and matrix algebra, and because we describe both basic and advanced features of SPSS for each, we are confident that SPSS users at all levels of expertise will find something new and useful in this book.

Special Features
Behind the Scenes

The Behind the Scenes sections explain the conceptual machinery underlying the statistical tests. In contrast to merely presenting the equations for computing the statistic, these sections describe the idea behind each test in plain language. In writing these sections, we sought to answer, in conceptual terms, the questions, What does SPSS do with your data to transform it into the test statistic? Which parts of the data are important for this calculation? and How does the output relate to the meaning of the test? After that, and only where it is helpful to building a conceptual understanding, we give the equation for the test and explain each part in terms of the idea behind the test. Several Behind the Scenes sections also contain schematic diagrams that are intended to clarify how different patterns of data relate to key ideas in the test. These sections were written specifically for introductory students seeking to make a connection between the ideas taught in a statistics course or textbook and the SPSS procedure.

Connections

The Connections sections use SPSS to demonstrate the equivalence among tests that are often treated as distinct. Particularly for introductory students, the syllabus of a statistics course can seem like a laundry list of unrelated tests. The layout of SPSS also supports this impression by segregating similar tests into different menus. The purpose of the Connections sections is to provide a “bigger picture” perspective by highlighting the conceptual similarities across tests. We do this by showing commonalities within a family of tests (e.g., those based on the general linear model) and by relating entirely different types of tests to each other (e.g., between nonparametric tests and ANOVA-based tests). We also use the Connections sections to point out similarities in the SPSS output across different but related statistical tests.

A Closer Look

The A Closer Look sections feature advanced topics that are beyond the scope of other introductory SPSS books. These sections teach the reader how to use SPSS to compute tests or display output that can be important to report in a research paper but that SPSS does not compute or display by default. Though the topics are more advanced or specialized, the A Closer Look sections are nonetheless written so that introductory students can understand when and why they might want to use them and so that more advanced students can quickly learn how to compute them. Topics covered in A Closer Look sections include custom hypothesis tests among group means in ANOVA, assumption checking in the General Linear Model, and saving predicted scores in multiple regression.

Making the Most of Syntax

In the Making the Most of Syntax sections, we describe statistical tests and output options that are exclusive to syntax. These include extensive treatment of custom hypothesis testing in ANOVA, MANOVA, ANCOVA, and regression, and an entire chapter on the advanced matrix algebra functions available only through syntax in SPSS. Our emphasis on the powerful capacity of the syntax functions is unique among introductory SPSS books. To help readers learn how to use syntax in their own research, we provide the general form and also a specific example of each syntax function. As always, we emphasize conceptual understanding by linking the specifics of the syntax functions to the general idea behind the test.

This section also highlights the value of using syntax for all statistical tests even when other options are available. Syntax is the easiest way to rerun statistical tests with slight variations or with different variables. And by describing the syntax corresponding to every topic, this book teaches readers to create a syntax log that provides a complete record of their data analysis process, from data cleaning all the way through to figures for publication.

Data Files

Each of the statistical tests covered here is accompanied by an example data set, and the screenshots and output that are shown in each chapter are based on these data sets. Our intention is that the reader can follow along and practice analysis using these data sets, so we have made the data files available on the book webpage at http://www.sagepub.com/berkman. We hope it is clear from the content of the data sets that they are simulated and intended for illustrative purposes only.

Acknowledgments

We would like to acknowledge the insightful feedback from our brilliant colleagues in statistics education, Emily Falk and Hongjing Lu, as well as the willingness of many of our students to serve as proofreaders and guinea pigs for this book over the last few years. We also appreciate helpful comments from several expert reviewers in the field. They made our jobs easier and improved the book substantially.

*Note: SPSS was acquired by IBM in October 2009.

Elliot T. Berkman is Assistant Professor of Psychology and director of the Social and Affective Neuroscience Laboratory at the University of Oregon. He has been teaching statistics to graduate students using SPSS for the past six years. In that time, he has been awarded the UCLA Distinguished Teaching Award and the Arthur J. Woodward Peer Mentoring Award. He has published numerous papers on the social psychological and neural processes involved in goal pursuit. His research on smoking cessation was recognized with the Joseph A. Gengerelli Distinguished Dissertation Award. He received his PhD in 2010 from the University of California, Los Angeles.

Steven P. Reise is professor, chair of Quantitative Psychology, and codirector of the Advanced Quantitative Methods training program at the University of California, Los Angeles. Dr. Reise is an internationally renowned teacher in quantitative methods, in particular the application of item response theory models to personality, psychopathology, and patient-reported outcomes. In recognition of his dedication to teaching, Dr. Reise was named “Professor of the Year” in 1995–96 by the graduate students in the psychology department at UC Riverside and was awarded the 2008 Psychology Department Distinguished Teaching Award. Most recently, in recognition of his campus-wide and global contributions, Dr. Reise was awarded the University of California campus-wide distinguished teaching award. Dr. Reise has spent the majority of the last twenty years investigating the application of latent variable models in general and item response theory (IRT) models in particular to personality, psychopathology, and health outcomes data. In 1998, Dr. Reise was recognized for his work with the Raymond B. Cattell award for outstanding multivariate experimental psychologist. Dr. Reise has over 70 refereed publications, including two Annual Review chapters, two contributions to American Psychological Association handbooks, and several articles in leading journals such as Psychological Assessment and Psychological Methods. Along with Dr. Susan Embretson, he is the author of the leading textbook on item response theory, Item Response Theory for Psychologists (2000 and forthcoming). He received his PhD from the Department of Psychology at the University of Minnesota in 1990.

## Appendix: General Formulation of Contrasts Using LMATRIX

In Chapter 7, we walked through a single example of how to use the LMATRIX function to compute a few custom contrasts in syntax. The LMATRIX function, however, is a powerful tool that can compute nearly any contrast among any combination of group means. In the most general terms, the steps for figuring out the right syntax are as follows:

• Write down the contrast coefficients for each cell in a table like the one shown in the figure. The first factor should be along the rows, and the second factor should be along the columns.

If you have a three-way ANOVA, make separate tables for each level of the first factor, with the second factor levels in the rows and the third factor levels in the columns. (For example, suppose we wanted to look at the gender of the guests in addition to their side and relationship to the couple. Then there would be two 2 × 4 tables like the one shown in the figure, one for males and one for females.)

• Compute the marginal sums across the rows and across the columns. If you have a three-way ANOVA, make a new table that is identical in form to the ones you made for each level of the first factor and contains a sum of all the other tables.

• After the “/LMATRIX =” tag, list out all of the factors and all of the interactions in the same order as they are listed in the GLM and DESIGN tags. For example,

/LMATRIX = IV1 IV2 IV1*IV2

for two factors, or

/LMATRIX = IV1 IV2 IV3 IV1*IV2 IV1*IV3 IV2*IV3 IV1*IV2*IV3

for three factors.

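• Write down the cell values and marginal sums for each term based on the tables you generated in the first two steps. With two factors (one with 2 levels and the other with 4 levels), the IV1 term takes the 2 row sums, the IV2 term takes the 4 column sums, and the IV1*IV2 term takes the 8 cell values, read across the first row and then across the second.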
• Remove any term (and its coefficients) if all the coefficients are equal to 0.
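In symbols, the two-factor case can be summarized roughly as follows. The notation ($a_{ij}$ for the cell coefficients, $r_i$ for row sums, $c_j$ for column sums) is introduced here for illustration only and is an assumption, not the book's own notation.

```latex
% a_{ij}: contrast coefficient for level i of IV1 (rows) and level j of IV2 (columns)
\begin{align*}
r_i &= \sum_{j=1}^{4} a_{ij}, \quad i = 1, 2
  && \text{(row sums, used for the IV1 term)} \\
c_j &= \sum_{i=1}^{2} a_{ij}, \quad j = 1, \dots, 4
  && \text{(column sums, used for the IV2 term)}
\end{align*}

% Resulting layout of the subcommand, before dropping all-zero terms:
% /LMATRIX = IV1      r_1 r_2
%            IV2      c_1 c_2 c_3 c_4
%            IV1*IV2  a_{11} a_{12} a_{13} a_{14} a_{21} a_{22} a_{23} a_{24}
```

Any term whose listed values are all 0 is then dropped, as described in the last step.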

For example, suppose we wanted to compute a contrast, based on the data set “Wedding.sav” from Chapters 6 and 7, that tests whether the difference in dancing between co-workers and family on the bride's side differs between male and female guests.

The corresponding syntax is

/LMATRIX = gender*relation 0 -1 0 1 0 1 0 -1

gender*side*relation 0 -1 0 1 0 0 0 0 0 1 0 -1 0 0 0 0
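The bookkeeping in the steps above can also be checked outside SPSS. The following Python sketch is not part of SPSS; the helper names `lmatrix_terms` and `format_lmatrix` are hypothetical. It sums the cell coefficients over the factors missing from each term and drops terms that come out all zero. The nonzero cells are read off the positions of the `gender*side*relation` coefficients in the syntax above, and the 0-based level indices are an assumption about how the levels are ordered in the data file.

```python
from itertools import combinations, product


def lmatrix_terms(factors, coefs):
    """Return [(term_name, values)] for every term that is not all zero.

    factors: list of (name, n_levels) in the order used in the GLM command.
    coefs:   dict mapping a tuple of 0-based level indices (one per factor)
             to that cell's contrast coefficient; unlisted cells count as 0.
    """
    names = [name for name, _ in factors]
    sizes = [n_levels for _, n_levels in factors]
    terms = []
    # Main effects first, then two-way terms, and so on up to the full interaction.
    for order in range(1, len(factors) + 1):
        for idx in combinations(range(len(factors)), order):
            values = []
            # The last factor named in a term varies fastest.
            for levels in product(*(range(sizes[i]) for i in idx)):
                # Marginal sum: add every cell that matches these levels.
                total = sum(
                    c for cell, c in coefs.items()
                    if all(cell[i] == lv for i, lv in zip(idx, levels))
                )
                values.append(total)
            if any(values):  # drop terms whose coefficients are all 0
                terms.append(("*".join(names[i] for i in idx), values))
    return terms


def format_lmatrix(terms):
    """Render the surviving terms as an /LMATRIX subcommand string."""
    lines = [name + " " + " ".join(str(v) for v in values) for name, values in terms]
    return "/LMATRIX = " + "\n  ".join(lines)


# Factor order and level counts follow the example in the text.
factors = [("gender", 2), ("side", 2), ("relation", 4)]
# Nonzero cells, indexed as (gender, side, relation); indices are assumed.
coefs = {
    (0, 0, 1): -1, (0, 0, 3): 1,
    (1, 0, 1): 1, (1, 0, 3): -1,
}
terms = lmatrix_terms(factors, coefs)
print(format_lmatrix(terms))
```

Under these assumptions only the `gender*relation` and `gender*side*relation` terms survive, matching the syntax shown above. Every other main effect and interaction sums to zero and is removed.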