Introduction to lm Function in R
Many generic functions are available for working with a fitted regression model, for example, for testing the coefficients, computing the residuals, and obtaining predicted values. Therefore, a good grasp of the lm() function is necessary. It is assumed that you already know how to perform a regression analysis using the lm() function.
mod <- lm(mpg ~ hp, data = mtcars)
To learn about performing linear regression analysis using the lm() function, you can visit the article “Performing Linear Regression in R”.
Objects of “lm” Class
The object returned by the lm() function has class “lm”. Objects of class “lm” have mode “list”.
class(mod)
The names of the components of an “lm” object can be queried via
names(mod)
All the components of an “lm” object can be accessed directly. For example,
mod$rank
mod$coef # or mod$coefficients
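As a minimal sketch of the points above, the following refits the same model and inspects the returned object. The choice of df.residual as the component to extract is only for illustration; any name listed by names(mod) can be used in the same way.

```r
# Refit the model used throughout the article
mod <- lm(mpg ~ hp, data = mtcars)

class(mod)      # "lm"
mode(mod)       # "list"
is.list(mod)    # TRUE

# Components can be extracted with $ or [[ ]]
mod$df.residual           # residual degrees of freedom: 32 - 2 = 30
mod[["coefficients"]]     # same as mod$coefficients

# Extractor functions are usually preferred over direct access
all.equal(coef(mod), mod$coefficients)   # TRUE
```

Using extractor functions such as coef() rather than `$` keeps code working even if the internal layout of the object differs between model classes.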
Generic Functions of “lm” model
The following is the list of some generic functions for the fitted “lm” model.
Generic Function | Short Description |
---|---|
print() | print or display the results in the R Console |
summary() | display regression coefficients, their standard errors, t-ratios, p-values, and significance |
coef() | extracts regression coefficients |
residuals() or resid() | extract residuals of the fitted model |
fitted() or fitted.values() | extract fitted values |
anova() | perform comparisons of nested models |
predict() | compute predicted values for new data |
plot() | draw diagnostic plots of the regression model |
confint() | compute the confidence intervals for regression coefficients |
deviance() | compute the residual sum of squares |
vcov() | compute estimated variance-covariance matrix |
logLik() | compute the log-likelihood |
AIC(), BIC() | compute information criteria |
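A short sketch of how several of these generic functions might be called on the fitted model is given below. The second model, mod2, which adds wt as a regressor, is a hypothetical example introduced here only to give anova() a nested pair to compare.

```r
mod <- lm(mpg ~ hp, data = mtcars)

coef(mod)               # intercept and slope
head(residuals(mod))    # first few residuals
head(fitted(mod))       # first few fitted values
deviance(mod)           # residual sum of squares
vcov(mod)               # 2 x 2 variance-covariance matrix
AIC(mod)
BIC(mod)

# deviance() agrees with the sum of squared residuals
all.equal(deviance(mod), sum(residuals(mod)^2))

# anova() compares nested models; mod2 is a hypothetical larger model
mod2 <- lm(mpg ~ hp + wt, data = mtcars)
anova(mod, mod2)
```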
It is better to save the object returned by the summary() function.
The summary() function returns an object of class “summary.lm”, and its components can be queried via
sum_mod <- summary(mod)
names(sum_mod)
names(summary(mod)) # equivalent, without saving the object
The components of the summary() object can be obtained as
sum_mod$residuals
sum_mod$r.squared
sum_mod$adj.r.squared
sum_mod$df
sum_mod$sigma
sum_mod$fstatistic
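The saved summary object can also be used for further computation. The sketch below, under the assumption that the same mod is used, pulls the p-value of the slope from the coefficient table and recovers the p-value of the overall F-test from the fstatistic component, which stores only the F value and its degrees of freedom. The names coef_tab, f, and p_overall are illustrative.

```r
mod <- lm(mpg ~ hp, data = mtcars)
sum_mod <- summary(mod)

# Coefficient table: estimates, standard errors, t values, p-values
coef_tab <- sum_mod$coefficients    # same as coef(sum_mod)
coef_tab["hp", "Pr(>|t|)"]          # p-value for the slope

# fstatistic holds the F value and its two degrees of freedom
f <- sum_mod$fstatistic
p_overall <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)
unname(p_overall)
```

With a single regressor, the overall F-test and the t-test for the slope test the same hypothesis, so the two p-values should agree.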
Computation and Visualization of Prediction and Confidence Interval
The confidence intervals for the estimated coefficients can be computed as
confint(mod, level = 0.95)
Note that the level argument is optional when the confidence level is 95% (significance level of 5%), because 0.95 is its default value.
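To make clear what confint() computes for an “lm” model, the following sketch rebuilds the 95% intervals by hand as estimate ± t-quantile × standard error and compares them with the output of confint(). The variable names est, se, tq, and manual_ci are illustrative.

```r
mod <- lm(mpg ~ hp, data = mtcars)

est <- coef(mod)
se  <- sqrt(diag(vcov(mod)))               # standard errors of the coefficients
tq  <- qt(0.975, df = df.residual(mod))    # t quantile for a 95% interval

manual_ci <- cbind(lower = est - tq * se, upper = est + tq * se)
manual_ci
confint(mod)                               # should match manual_ci

# A 90% interval for the slope only
confint(mod, "hp", level = 0.90)
```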
The confidence interval for the mean response and the prediction interval for an individual response, for hp (regressor) equal to 200 and 160, can be computed as
predict(mod, newdata = data.frame(hp = c(200, 160)), interval = "confidence")
predict(mod, newdata = data.frame(hp = c(200, 160)), interval = "prediction")
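The difference between the two interval types can be checked directly. In the sketch below (new_hp, ci_mean, and pi_ind are illustrative names), both calls return the same point predictions, but the interval for an individual response is wider because it also accounts for the error variance of a single observation.

```r
mod <- lm(mpg ~ hp, data = mtcars)
new_hp <- data.frame(hp = c(200, 160))

ci_mean <- predict(mod, newdata = new_hp, interval = "confidence")
pi_ind  <- predict(mod, newdata = new_hp, interval = "prediction")

colnames(ci_mean)    # "fit" "lwr" "upr"

# Same point predictions in both cases
all.equal(ci_mean[, "fit"], pi_ind[, "fit"])

# Interval widths: prediction intervals are wider
width_ci <- ci_mean[, "upr"] - ci_mean[, "lwr"]
width_pi <- pi_ind[, "upr"] - pi_ind[, "lwr"]
width_pi > width_ci
```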
The intervals returned by predict() can be used for computing and visualizing confidence and prediction bands. For example,
x <- seq(50, 350, length.out = 32)
# the column in newdata must be named hp, as in the model formula
pred <- predict(mod, newdata = data.frame(hp = x), interval = "prediction")
plot(mpg ~ hp, data = mtcars)
lines(pred[, 1] ~ x, col = 1) # fitted values
lines(pred[, 2] ~ x, col = 2) # lower limit
lines(pred[, 3] ~ x, col = 2) # upper limit
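As a variation on the plot above, both the confidence band for the mean response and the prediction band for individual responses can be drawn on one graph. This is only a sketch: the grid of 100 points, the colours, and the line types are arbitrary choices, and matlines() is used merely as a convenient way to draw the lower and upper limits in one call.

```r
mod <- lm(mpg ~ hp, data = mtcars)

grid <- data.frame(hp = seq(50, 350, length.out = 100))
conf_band <- predict(mod, newdata = grid, interval = "confidence")
pred_band <- predict(mod, newdata = grid, interval = "prediction")

plot(mpg ~ hp, data = mtcars)
lines(grid$hp, conf_band[, "fit"])                                  # fitted line
matlines(grid$hp, conf_band[, c("lwr", "upr")], lty = 2, col = 4)   # confidence band
matlines(grid$hp, pred_band[, c("lwr", "upr")], lty = 3, col = 2)   # prediction band
legend("topright", legend = c("fit", "confidence", "prediction"),
       lty = 1:3, col = c(1, 4, 2))
```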
Regression Diagnostics
For diagnostic plots, the plot() function can be used. By default, it produces four graphs:
- residuals vs fitted values
- QQ plot of standardized residuals
- scale-location plot of the square root of the absolute standardized residuals against fitted values
- standardized residuals vs leverage
To plot, say, only the QQ plot, use
plot(mod, which = 2)
The which argument selects which of the graphs is produced.
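To view all four default graphs at once, the plotting region can be split into a 2 x 2 grid before calling plot(). The sketch below also extracts the quantities behind the graphs, namely the standardized residuals and the leverages, with rstandard() and hatvalues(). The names op, std_res, and lev are illustrative.

```r
mod <- lm(mpg ~ hp, data = mtcars)

op <- par(mfrow = c(2, 2))   # 2 x 2 grid; save the old settings
plot(mod)                    # the four default diagnostic plots
par(op)                      # restore the previous layout

# The quantities behind the plots are also available directly
std_res <- rstandard(mod)    # standardized residuals
lev     <- hatvalues(mod)    # leverages
head(sort(lev, decreasing = TRUE), 3)   # highest-leverage observations
```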