Multiple linear regression fits a response variable as a linear combination of multiple X variables by the method of least squares.

### Assumptions:

- The linear function $Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki} + e_i$ is the correct model, where $Y_i$ is the *i*th observed value of Y, $X_{ji}$ is the *i*th observed value of the *j*th X variable, and $e_i$ is the error term. Equivalently, the expected value of Y for a given value of X is $Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_k X_k$. The **intercept** is $b_0$, the expected value of Y when the value for each X variable is 0.
- The $X_j$ variable (predictor variable) values are fixed (i.e., none of the $X_j$ is a random variable).
- The $e_i$ are independent, and identically normally distributed with mean 0 and the same variance.
- The Y variable (**response variable**) observations are independent.
- The variable Y is normally distributed with the same variance as the $e_i$. For a given set of X variable values, the variable Y has constant mean.

The normality assumption is required for hypothesis tests, but not for estimation.
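As an illustration of the estimation step, the following is a minimal sketch of a least-squares fit, assuming NumPy is available. The function name `fit_multiple_regression` and the data are hypothetical and chosen only for this example; the data are generated without an error term so that the fitted coefficients can be compared with known values.

```python
import numpy as np


def fit_multiple_regression(X, y):
    """Return least-squares estimates [b0, b1, ..., bk] for the model
    y = b0 + b1*X[:, 0] + ... + bk*X[:, k-1] + e."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept b0.
    design = np.column_stack([np.ones(len(y)), X])
    # lstsq minimizes the sum of squared residuals ||design @ b - y||^2.
    coef, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# Hypothetical data generated from Y = 1 + 2*X1 - 3*X2 with no error term.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]

b = fit_multiple_regression(X, y)
print(b)  # estimates of b0, b1, b2
```

Nothing in this computation relies on normality of the errors; that assumption matters only when the estimates are used in hypothesis tests.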

The X variables are also known as the **independent** variables.

The Y variable is also known as the **dependent** variable.

The **coefficients** are $b_j$, the amount by which the expected value of Y increases when $X_j$ increases by a unit amount, *when all the other X variables are held constant*. This interpretation of the coefficients does not hold if some of the X variables are functions of the others, such as an interaction term $X_j X_k$.

Note that it is *not* assumed that the X variables are independent of each other.
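The interpretation of the coefficients can be checked numerically. The sketch below uses hypothetical coefficient values (not estimates from any real data) to show that a unit increase in one X variable changes the expected value of Y by exactly its coefficient in a purely additive model, but not when an interaction term is present.

```python
def predict(b, x):
    """Expected Y for coefficients b = [b0, b1, ..., bk] and X values x = [x1, ..., xk]."""
    return b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))


# Additive model with hypothetical coefficients: Y = 1 + 2*X1 - 3*X2.
b = [1.0, 2.0, -3.0]
base = predict(b, [4.0, 5.0])
bumped = predict(b, [5.0, 5.0])      # X1 raised by one unit, X2 held constant
change_additive = bumped - base      # equals b1


def predict_interaction(b, x1, x2):
    """Model with an interaction term: Y = b0 + b1*X1 + b2*X2 + b3*X1*X2."""
    return predict(b, [x1, x2, x1 * x2])


# The third X variable is a function of the first two, so it cannot be
# held constant while X1 changes (unless X2 is 0).
b_int = [1.0, 2.0, -3.0, 0.5]
change_at_x2_0 = predict_interaction(b_int, 5.0, 0.0) - predict_interaction(b_int, 4.0, 0.0)
change_at_x2_5 = predict_interaction(b_int, 5.0, 5.0) - predict_interaction(b_int, 4.0, 5.0)

print(change_additive, change_at_x2_0, change_at_x2_5)
```

In the interaction model, the effect of a unit increase in X1 is b1 + b3*X2, so it depends on the value at which X2 is held.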

### Guidance:

- Ways to detect, before performing the multiple linear regression, whether your data violate any assumptions.
- Ways to examine multiple linear regression results to detect assumption violations.
- Possible alternatives if your data or multiple linear regression results indicate assumption violations.

To properly analyze and interpret the results of *multiple linear regression*, you should be familiar with the following terms and concepts:

If you are not familiar with these terms and concepts, you are advised to consult with a statistician. Failure to understand and properly apply *multiple linear regression* may result in drawing erroneous conclusions from your data. Additionally, you may want to consult the following references:

- Belsley, David A., Kuh, Edwin, and Welsch, Roy E. 1980. *Regression Diagnostics.* New York: John Wiley & Sons.
- Brownlee, K. A. 1965. *Statistical Theory and Methodology in Science and Engineering.* New York: John Wiley & Sons.
- Daniel, Wayne W. 1995. *Biostatistics.* 6th ed. New York: John Wiley & Sons.
- Draper, N. R. and Smith, H. 1981. *Applied Regression Analysis.* 2nd ed. New York: John Wiley & Sons.
- Hoaglin, D. C., Mosteller, F., and Tukey, J. W. 1985. *Exploring Data Tables, Trends, and Shapes.* New York: John Wiley & Sons.
- Neter, J., Kutner, M. H., Nachtsheim, C. J., and Wasserman, W. 1996. *Applied Linear Regression Models.* 3rd ed. Chicago: Irwin.
- Neter, J., Wasserman, W., and Kutner, M. H. 1990. *Applied Linear Statistical Models.* 3rd ed. Homewood, IL: Irwin.
- Rosner, Bernard. 1995. *Fundamentals of Biostatistics.* 4th ed. Belmont, California: Duxbury Press.
- Sokal, Robert R. and Rohlf, F. James. 1995. *Biometry.* 3rd ed. New York: W. H. Freeman and Co.
- Zar, Jerrold H. 1996. *Biostatistical Analysis.* 3rd ed. Upper Saddle River, NJ: Prentice-Hall.
