### Introduction to lm Function in R

Many generic functions work on a fitted regression model: they extract the regression coefficients, test them, compute residuals, produce predicted values, and so on. A good grasp of the `lm()` function and of the object it returns is therefore necessary. It is assumed that you already know how to perform a regression analysis with `lm()`; the model below is used throughout.

    mod <- lm(mpg ~ hp, data = mtcars)


To learn about performing linear regression analysis with the `lm()` function, see the article “Performing Linear Regression in R”.

### Objects of “lm” Class

The object returned by the `lm()` function has class “lm”. Objects of class “lm” are stored as lists, that is, their mode is "list".

    class(mod)

The names of the components stored in an “lm” object can be queried via

    names(mod)

All the components of an “lm” object can be accessed directly. For example,

    mod$rank
    mod$coef # or mod$coefficients
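As a quick check of the statements above, the following sketch inspects the fitted object. It refits the model from the start of the article so that the block runs on its own; the choice of components to display is only illustrative.

```r
# Refit the model so this block is self-contained
mod <- lm(mpg ~ hp, data = mtcars)

mode(mod)              # storage mode of the object
is.list(mod)           # TRUE: an "lm" object is a list underneath
inherits(mod, "lm")    # TRUE: checks the class attribute

# Components can be extracted with $ or [[ ]]
mod$coefficients
mod[["df.residual"]]   # residual degrees of freedom (n minus number of coefficients)
```

Note that `mod$coef` works only because `$` partially matches component names on lists; the extractor `coef(mod)` is usually the safer choice in scripts.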

### Generic Functions of “lm” model

The following is the list of some generic functions for the fitted “lm” model.

| Generic Function | Short Description |
|---|---|
| `print()` | prints a short display of the fitted model in the R console |
| `summary()` | displays regression coefficients, their standard errors, t-ratios, p-values, and significance codes |
| `coef()` | extracts regression coefficients |
| `residuals()` or `resid()` | extracts residuals of the fitted model |
| `fitted()` or `fitted.values()` | extracts fitted values |
| `anova()` | compares nested models |
| `predict()` | computes predicted values for new data |
| `plot()` | draws diagnostic plots of the regression model |
| `confint()` | computes confidence intervals for the regression coefficients |
| `deviance()` | computes the residual sum of squares |
| `vcov()` | computes the estimated variance-covariance matrix of the coefficients |
| `logLik()` | computes the log-likelihood |
| `AIC()`, `BIC()` | compute information criteria |
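The sketch below applies a few of these generics to the model fitted earlier and checks two of the descriptions numerically. The intercept-only model `mod_null` is a hypothetical addition, introduced here only so that `anova()` has a nested model to compare against.

```r
# Refit the model so this block is self-contained
mod <- lm(mpg ~ hp, data = mtcars)

coef(mod)                  # intercept and slope
head(residuals(mod))       # first few residuals
head(fitted(mod))          # first few fitted values

# deviance() of an "lm" fit is the residual sum of squares
rss <- sum(residuals(mod)^2)
all.equal(deviance(mod), rss)

# Standard errors are the square roots of the diagonal of vcov()
se <- sqrt(diag(vcov(mod)))
se

# anova() compares nested models; mod_null is a hypothetical
# intercept-only model used for illustration
mod_null <- lm(mpg ~ 1, data = mtcars)
anova(mod_null, mod)
```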

It is often convenient to save the object returned by `summary()` rather than only printing it.

The `summary()` function returns an object of class “summary.lm”, and its components can be queried via

    sum_mod <- summary(mod)
    names(sum_mod)
    names(summary(mod)) # equivalent, without saving the object

The components of the object returned by `summary()` can be extracted as

    sum_mod$residuals
    sum_mod$r.squared
    sum_mod$adj.r.squared
    sum_mod$df
    sum_mod$sigma
    sum_mod$fstatistic
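One reason to save the summary object is that some printed quantities are not stored directly. As a minimal sketch, the block below pulls the coefficient table out of the summary and recomputes the p-value of the overall F-test from `fstatistic`; the variable names `coef_tab`, `f`, and `p_value` are chosen here for illustration.

```r
# Refit the model so this block is self-contained
mod <- lm(mpg ~ hp, data = mtcars)
sum_mod <- summary(mod)

class(sum_mod)                   # "summary.lm"

# Coefficient table: estimates, standard errors, t values, p-values
coef_tab <- coef(sum_mod)        # same matrix as sum_mod$coefficients
coef_tab["hp", "Pr(>|t|)"]       # p-value for the slope

# The overall F-test p-value is printed by summary() but not stored;
# it can be recomputed from the F statistic and its degrees of freedom
f <- sum_mod$fstatistic
p_value <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)
unname(p_value)
```

With a single regressor, as here, the F-test and the t-test of the slope are equivalent, so the two p-values agree.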

### Computation and Visualization of Prediction and Confidence Interval

The confidence interval for estimated coefficients can be computed as

    confint(mod, level = 0.95)

Note that the `level` argument is optional when the confidence level is 95% (significance level of 5%), because 0.95 is its default value.
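To see what `confint()` computes for an “lm” fit, the interval can be rebuilt by hand as estimate ± t-quantile × standard error. The following is a sketch of that calculation, with `manual_ci` and `ci_99` as illustrative names; it also shows a 99% interval for the slope alone.

```r
# Refit the model so this block is self-contained
mod <- lm(mpg ~ hp, data = mtcars)

est <- coef(mod)
se <- sqrt(diag(vcov(mod)))
t_crit <- qt(0.975, df = df.residual(mod))   # two-sided 95% critical value

manual_ci <- cbind(lower = est - t_crit * se,
                   upper = est + t_crit * se)
manual_ci

# Agrees with confint() apart from the column labels
all.equal(unname(manual_ci), unname(confint(mod)))

# A different level, restricted to one coefficient
ci_99 <- confint(mod, parm = "hp", level = 0.99)
ci_99
```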

The confidence interval for the mean response and the prediction interval for an individual response, at `hp` (the regressor) equal to 200 and 160, can be computed as

    # interval for the mean response
    predict(mod, newdata = data.frame(hp = c(200, 160)), interval = "confidence")
    # interval for an individual response
    predict(mod, newdata = data.frame(hp = c(200, 160)), interval = "prediction")

Intervals computed over a grid of `hp` values can be used to draw bands around the fitted line. For example, the following draws prediction bands:

    x <- seq(50, 350, length.out = 32)
    # the column in newdata must be named hp, like the regressor in the model
    pred <- predict(mod, newdata = data.frame(hp = x), interval = "prediction")
    plot(mpg ~ hp, data = mtcars)
    lines(pred[, 1] ~ x, col = 1) # fitted values
    lines(pred[, 2] ~ x, col = 2) # lower limit
    lines(pred[, 3] ~ x, col = 2) # upper limit
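Confidence and prediction bands are easy to confuse, so it can help to draw both on one plot. The sketch below does this for the same model; the grid, colours, and line types are arbitrary choices made for illustration.

```r
# Refit the model so this block is self-contained
mod <- lm(mpg ~ hp, data = mtcars)

grid <- data.frame(hp = seq(50, 350, length.out = 32))
conf_band <- predict(mod, newdata = grid, interval = "confidence")
pred_band <- predict(mod, newdata = grid, interval = "prediction")

# Prediction intervals add the error variance of a single observation,
# so they are wider than confidence intervals at every grid point
width_conf <- conf_band[, "upr"] - conf_band[, "lwr"]
width_pred <- pred_band[, "upr"] - pred_band[, "lwr"]
all(width_pred > width_conf)

plot(mpg ~ hp, data = mtcars)
lines(grid$hp, conf_band[, "fit"])                                       # fitted line
matlines(grid$hp, conf_band[, c("lwr", "upr")], lty = 2, col = "blue")   # confidence band
matlines(grid$hp, pred_band[, c("lwr", "upr")], lty = 3, col = "red")    # prediction band
legend("topright",
       legend = c("fit", "confidence band", "prediction band"),
       lty = 1:3, col = c("black", "blue", "red"))
```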

### Regression Diagnostics

For diagnostic plots, the `plot()` function can be used. By default it produces four graphs:

- residuals vs fitted values
- QQ plot of standardized residuals
- scale-location plot of the square root of the absolute standardized residuals against fitted values
- standardized residuals vs leverage

To draw only one of them, say the QQ plot, use

    plot(mod, which = 2)

The `which` argument selects which of the diagnostic graphs is drawn.
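To view all four default graphs at once, the plotting region can be split into a 2 × 2 grid. The sketch below does that and also extracts the quantities the graphs are built from; `old_par`, `std_res`, and `lev` are names chosen here for illustration.

```r
# Refit the model so this block is self-contained
mod <- lm(mpg ~ hp, data = mtcars)

# Draw the four default diagnostic plots in a 2 x 2 grid
old_par <- par(mfrow = c(2, 2))
plot(mod)
par(old_par)                            # restore the previous layout

# The quantities behind the plots can also be extracted directly
std_res <- rstandard(mod)               # standardized residuals
lev <- hatvalues(mod)                   # leverage of each observation
head(sort(lev, decreasing = TRUE), 3)   # three highest-leverage observations
```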