Chi-Square

Paul J. Lavrakas

doi:10.4135/9781412963947


Edited by:
Paul J. Lavrakas
In: Encyclopedia of Survey Research Methods
Chapter DOI: https://doi.org/10.4135/9781412963947.n64
Subject: Survey Research


The chi-square (χ²) is a test of significance for categorical variables. Significance tests indicate how likely it is that a result observed in a sample is due to sampling error rather than reflecting the entire population. The chi-square can be used as a goodness-of-fit test in univariate analysis or as a test of independence in bivariate analysis. The latter use is the more common. In this case, the test measures the significance of the relationship between two categorical variables, representing the first step toward bivariate analysis. For example, if a survey researcher wanted to learn whether gender is associated with an attitude (negative or positive) toward U.S. involvement in Iraq, chi-square is the simplest significance test for investigating whether there are reliable gender-related differences in these attitudes (see Table 1).

The logic behind the chi-square is to calculate the distance between the observed frequencies within the contingency table and the condition of statistical independence (i.e., the hypothesis of no association, or “null hypothesis”). The frequencies that Table 1 would contain in the case of no association (the so-called expected frequencies) are calculated, for each cell, by dividing the product of that cell's row and column marginal frequencies by the sample size. The greater the distance between the observed frequencies and the expected frequencies, the higher the chi-square. This is the formula:

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

where $f_o$ represents the observed frequencies, $f_e$ represents the expected frequencies, and the sum runs over all cells of the table. If the value of the chi-square is 0, there is no association between the variables in the sample. Unfortunately, the chi-square has no fixed maximum, which makes its interpretation unintuitive.
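As a worked illustration of the calculation described above, the following minimal sketch computes the expected frequencies and the chi-square value for the counts in Table 1. The function names `expected_frequencies` and `chi_square` are hypothetical helpers introduced here for illustration; they are not part of the original entry or of any library.

```python
# Observed counts from Table 1 (rows: Support, Oppose; columns: Female, Male).
observed = [[170, 200],
            [250, 150]]


def expected_frequencies(table):
    """Expected counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_frequencies(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))


expected = expected_frequencies(observed)
chi2 = chi_square(observed)
print([[round(e, 1) for e in row] for row in expected])
print(round(chi2, 2))
```

For these counts the expected frequency of the Support/Female cell is 370 × 420 / 770, or about 201.8, against 170 observed, and the chi-square comes to roughly 21.24.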

To interpret the value obtained, the researcher must first calculate the degrees of freedom (df) of the contingency table by multiplying the number of rows minus 1 by the number of columns minus 1. Second, given the values of chi-square and df, he or she has to find the corresponding p-level. This value can be located in a chi-square distribution table, which is reported in most statistics handbooks, or calculated with statistical software such as Statistical Package for the Social Sciences (SPSS) or SAS.
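The two steps above can be sketched in a few lines of standard-library Python. This sketch assumes a 2 × 2 table such as Table 1, where df = 1; in that special case the upper-tail probability can be written with the complementary error function, because a chi-square variable with 1 df is the square of a standard normal variable. The helper names are hypothetical, and tables with more than 1 df would need a distribution table or statistical software instead.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """df = (rows - 1) * (columns - 1)."""
    return (n_rows - 1) * (n_cols - 1)


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square value, valid ONLY for df = 1."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Observed counts from Table 1 (rows: Support, Oppose; columns: Female, Male).
observed = [[170, 200], [250, 150]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Chi-square: sum over cells of (observed - expected)^2 / expected.
chi2 = sum((observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
           / (row_totals[i] * col_totals[j] / n)
           for i in range(2) for j in range(2))

df = degrees_of_freedom(len(observed), len(observed[0]))
p = p_value_df1(chi2)
print(df, p)
```

For Table 1 this gives df = 1 and a p-level well below 0.001. As a sanity check on the df = 1 shortcut, `p_value_df1(3.841)` returns approximately 0.05, matching the familiar critical value found in printed tables.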

Table 1 Example of contingency table for chi-square analysis (frequency counts)
Support/Oppose U.S. Involvement in Iraq	Female	Male	Total
Support	170	200	370
Oppose	250	150	400
Total	420	350	770

The p-level is the crucial figure to consider when evaluating the test. This is the actual value that indicates the significance of the association. It says, in short, how probable it is that the relationship observed in the survey data is due to mere sampling error.

The chi-square test must be used cautiously. First, the researcher should have a probability sample larger than 100. Second, because the chi-square statistic is sensitive to sample size, the researcher cannot compare chi-square values obtained from different samples. Third, researchers should make sure that the expected values in the contingency table are not too small (<5), because the chi-square value would then be heavily biased. Finally, sometimes it makes no sense to calculate the chi-square at all, for example, when both variables have too many categories.
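Two of these cautions lend themselves to a quick check in code. The sketch below, using hypothetical helper names, tests the sample-size and minimum-expected-count conditions for Table 1 and then shows the sensitivity to sample size: doubling every count leaves the proportions unchanged but doubles the chi-square.

```python
def expected_frequencies(table):
    """Expected counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_frequencies(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))


def check_assumptions(table, min_n=100, min_expected=5):
    """Report whether N exceeds min_n and every expected count is at least min_expected."""
    n = sum(sum(row) for row in table)
    smallest = min(min(row) for row in expected_frequencies(table))
    return {"n": n, "n_ok": n > min_n,
            "min_expected": smallest, "expected_ok": smallest >= min_expected}


# Observed counts from Table 1 (rows: Support, Oppose; columns: Female, Male).
observed = [[170, 200], [250, 150]]
checks = check_assumptions(observed)
print(checks)

# Same proportions, twice the sample size: the chi-square doubles.
doubled = [[2 * count for count in row] for row in observed]
print(chi_square(observed), chi_square(doubled))
```

The thresholds of 100 and 5 are taken from the text above; they are passed as parameters so that a different rule of thumb can be substituted.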

In all these cases, the chi-square test should not be separated from a detailed inspection of the contingency table and/or the use of more sophisticated measures. Because the chi-square value is not easily interpretable, other measures have been derived from it, such as phi-square, Pearson's C, and Cramer's V. They are not influenced by sample size and, above all, tend to range from 0 to 1 (although this maximum is actually achievable only by Cramer's V). They measure the strength of the association, even when the association is nonlinear.
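The entry does not give formulas for these measures. The sketch below assumes the usual textbook definitions: phi-square as chi-square divided by N, Pearson's C as the square root of chi-square / (chi-square + N), and Cramer's V as the square root of chi-square / (N × min(rows − 1, columns − 1)). The function name `association_measures` is hypothetical.

```python
import math


def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum((table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
               / (row_totals[i] * col_totals[j] / n)
               for i in range(len(row_totals))
               for j in range(len(col_totals)))


def association_measures(table):
    """Return (phi-square, Pearson's C, Cramer's V) using the assumed textbook formulas."""
    chi2 = chi_square(table)
    n = sum(sum(row) for row in table)
    k = min(len(table) - 1, len(table[0]) - 1)
    phi_sq = chi2 / n
    pearson_c = math.sqrt(chi2 / (chi2 + n))
    cramers_v = math.sqrt(chi2 / (n * k))
    return phi_sq, pearson_c, cramers_v


# Observed counts from Table 1 (rows: Support, Oppose; columns: Female, Male).
observed = [[170, 200], [250, 150]]
phi_sq, pearson_c, cramers_v = association_measures(observed)
print(round(phi_sq, 3), round(pearson_c, 3), round(cramers_v, 3))
```

For Table 1 all three values are small (Cramer's V is roughly 0.17), which suggests that the association, although statistically significant, is fairly weak. Under these definitions, doubling every count in the table leaves the three measures unchanged, unlike the chi-square itself.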

...
