Cause and effect between two phenomena in a real-life setting cannot be judged or resolved without characterizing their patterns of occurrence, their correlation, and their uncertainties. Item response theory has been well developed by psychologists to explain the ability level of examinees who answer a series of questions varying in difficulty. The ability and difficulty levels are correlated random variables with a degree of dependence between them. Another revealing example is that the safety of a building cannot be ascertained without knowledge of both the stress on and the strength of the materials used in the building. When the strength, Y, exceeds the stress level, X, safety is guaranteed. Another serious life-and-death example is the distance of a populous city from the epicenter of an earthquake and the severity of damage in the city, as experienced in the tsunami of December 26, 2004, in south Asia. In a health application, X and Y are, respectively, the “susceptibility” and “immunity” levels of a person during the outbreak of a disease epidemic, and a person is healthy only so long as Y exceeds X. Identifying the underlying bivariate probability distribution in these and other applications reveals a great deal about how the two variables behave together.

Two such stochastic aspects X and Y in experimental or randomly observed studies are well explained by employing an appropriate underlying joint (probability) distribution f(x, y). Their patterns of occurrence, their correlation, and the prediction of one aspect from the observed level of the other are all feasible from randomly collected bivariate data. The domain of the data is in general from minus infinity to plus infinity for each variable, although in particular cases it may be a restricted region; a truncated or censored version of the bivariate (probability) distribution might be employed in such scenarios. A bivariate distribution could be of count, continuous, or mixed type. The conditional distributions f(x | Y = y) and f(y | X = x) reveal the influence of one variable on the other, whereas the marginal distributions f(x) and f(y) do not. For example, the predicted value of Y for a given level X = x, namely E(Y | X = x), is called the regression function of x. The conditional and marginal dispersions obey the inequality E[Var(Y | X)] ≤ Var(Y), which means that, on average, the conditional prediction of Y with knowledge of X is more precise than the unconditional prediction of Y. The inverse of the variance is called the precision. Also, the so-called product moment is built in a hierarchical manner in accordance with the result E(YX) = E[X E(Y | X)], where the outer expectation is with respect to the random variable X. The covariance is defined to be cov(Y, X) = E[X E(Y | X)] − E(Y)E(X). The covariance is scale dependent, so its magnitude could be misleading unless caution is exercised. Furthermore, the variance can also be hierarchically constructed according to the result Var(Y) = E[Var(Y | X)] + Var[E(Y | X)].
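The hierarchical results can be checked exactly on a small discrete example. The sketch below is illustrative only: the joint probabilities in `joint` and all function names are hypothetical and do not come from the entry or its tables. It uses the standard forms E(YX) = E[X E(Y | X)] and Var(Y) = E[Var(Y | X)] + Var[E(Y | X)], with exact rational arithmetic so that the identities hold without rounding error.

```python
from fractions import Fraction as F

# Hypothetical joint pmf f(x, y) on a small grid (illustrative numbers only).
joint = {
    (0, 0): F(2, 10), (0, 1): F(1, 10), (0, 2): F(1, 10),
    (1, 0): F(1, 10), (1, 1): F(2, 10), (1, 2): F(3, 10),
}


def marginal_x(joint):
    """Marginal f(x), obtained by summing f(x, y) over y."""
    fx = {}
    for (x, _), p in joint.items():
        fx[x] = fx.get(x, 0) + p
    return fx


def conditional_moments(joint):
    """Return {x: (E[Y | X = x], Var(Y | X = x))}."""
    fx = marginal_x(joint)
    out = {}
    for x, px in fx.items():
        mean = sum(y * p for (xx, y), p in joint.items() if xx == x) / px
        second = sum(y * y * p for (xx, y), p in joint.items() if xx == x) / px
        out[x] = (mean, second - mean ** 2)
    return out


fx = marginal_x(joint)
cond = conditional_moments(joint)

# Direct (non-hierarchical) moments computed from the joint pmf.
e_x = sum(x * p for x, p in fx.items())
e_y = sum(y * p for (_, y), p in joint.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())
var_y = sum(y * y * p for (_, y), p in joint.items()) - e_y ** 2

# Hierarchical versions: the outer expectation is taken over X.
e_xy_tower = sum(x * cond[x][0] * fx[x] for x in fx)        # E[X E(Y | X)]
mean_cond_var = sum(cond[x][1] * fx[x] for x in fx)         # E[Var(Y | X)]
var_cond_mean = sum(cond[x][0] ** 2 * fx[x] for x in fx) - e_y ** 2  # Var[E(Y | X)]
cov_yx = e_xy_tower - e_y * e_x

print("E(YX) direct vs hierarchical:", e_xy, e_xy_tower)
print("Var(Y) vs E[Var] + Var[E]:", var_y, mean_cond_var + var_cond_mean)
print("cov(Y, X):", cov_yx)
```

Because Var[E(Y | X)] is never negative, the decomposition also shows why the average conditional variance cannot exceed Var(Y).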

As in the univariate case, the moment generating, cumulant generating, and probability generating functions are derived and used to identify central and noncentral moments and cumulants, along with their properties, in bivariate distributions. The correlation coefficient ρ_{Y,X} between the designated dependent variable Y and the chosen independent (more often called predictor) variable X is cov(Y, X)/(σ_Y σ_X), where σ_Y = √Var(Y) and σ_X = √Var(X) are the standard deviations. The correlation coefficient is scale free. A simple linear regression model is Y = β_0 + β_1 x + ε for predicting Y at a selected level X = x, and the so-called regression parameter (slope) is β_1 = ρ_{Y,X} σ_Y/σ_X. See a variety of popular bivariate distributions in Tables 1 and 2.
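As a rough numerical illustration of these relations, the sketch below computes sample analogues of the covariance, the correlation coefficient, and the regression slope for a small made-up data set. The data values and helper names are hypothetical, and sample (n − 1) estimates stand in for the population quantities. It checks that the least-squares slope equals the correlation times the ratio of standard deviations, and that rescaling X changes the covariance but not the correlation.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(x, y):
    """Sample covariance with the n - 1 divisor."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def correlation(x, y):
    """Sample correlation: cov(Y, X) / (s_Y * s_X)."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


def least_squares(x, y):
    """Intercept and slope of the fitted line y = b0 + b1 * x."""
    b1 = covariance(x, y) / covariance(x, x)
    b0 = mean(y) - b1 * mean(x)
    return b0, b1


# Made-up illustrative data (not taken from the entry).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = correlation(x, y)
s_x = math.sqrt(covariance(x, x))
s_y = math.sqrt(covariance(y, y))
b0, b1 = least_squares(x, y)

# Rescale X (for example, metres to centimetres) to see the effect of units.
x_cm = [100 * a for a in x]

print("slope:", b1, "r * s_y / s_x:", r * s_y / s_x)
print("cov before/after rescaling:", covariance(x, y), covariance(x_cm, y))
print("corr before/after rescaling:", r, correlation(x_cm, y))
```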

...
