Generalized linear models (GLMs) can be used when the response variable is not normally distributed, or when a transformation of its mean, rather than the mean itself, is linear in the predictors. GLMs are flexible extensions of linear models that are used to fit regression models to non-Gaussian data.
The basic form of a generalized linear model is
\begin{align*}
g(\mu_i) &= X_i' \beta \\
&= \beta_0 + \sum\limits_{j=1}^p x_{ij} \beta_j
\end{align*}
where $\mu_i=E(Y_i)$ is the expected value of the response variable $Y_i$ given the predictors, $g(\cdot)$ is a smooth and monotonic link function that connects $\mu_i$ to the predictors, $X_i'=(x_{i0}, x_{i1}, \cdots, x_{ip})$ is the known vector of predictor values for the $i$th observation with $x_{i0}=1$, and $\beta=(\beta_0, \beta_1, \cdots, \beta_p)'$ is the unknown vector of regression coefficients.
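To make the role of the link function concrete, the following minimal sketch evaluates the linear predictor and maps it to the mean scale with a log link. The coefficient and predictor values are made up purely for illustration; `make.link()` is the helper in the `stats` package that returns the functions belonging to a named link.

```r
# Hypothetical coefficients and predictor values, chosen only for illustration
beta <- c(0.5, 0.2)            # (beta_0, beta_1)
X    <- cbind(1, c(1, 2, 3))   # each row is X_i' with x_i0 = 1

eta <- drop(X %*% beta)        # linear predictor X_i' beta

lnk <- make.link("log")        # log link: g(mu) = log(mu)
mu  <- lnk$linkinv(eta)        # mu_i = g^{-1}(eta_i) = exp(eta_i)

# Applying g to mu recovers the linear predictor
all.equal(lnk$linkfun(mu), eta)
```

The model is linear on the scale of $g(\mu_i)$, not on the scale of $\mu_i$ itself, which is why the inverse link is needed to get back to fitted means.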
The `glm()` function can be used to fit a generalized linear model, using the form

    mod <- glm(formula, family = gaussian, data = data.frame)

The `family` argument is a description of the error distribution and link function to be used in the model; `gaussian` is the default.
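As a brief illustration of the `family` argument, the sketch below fits a binary-response model to the built-in `mtcars` data. This dataset is not part of the original example; it is used here only because it contains a 0/1 variable (`am`), and the choice of `wt` as the predictor is arbitrary.

```r
# Binary response (am: 0 = automatic, 1 = manual), so a binomial family fits
logit_mod <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))

# The same linear predictor with a probit link instead of the default logit
probit_mod <- glm(am ~ wt, data = mtcars, family = binomial(link = "probit"))

coef(logit_mod)
coef(probit_mod)
```

The two fits share the same formula and data; only the link passed through `family` changes, so the coefficients are on different scales and should not be compared directly.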
A generalized linear model is specified by giving a symbolic description of the linear predictor (the formula) and a description of the error distribution (the family). Each family accepts the following link functions:
Family Name | Link Functions |
---|---|
`binomial` | `logit`, `probit`, `cloglog` |
`gaussian` | `identity`, `log`, `inverse` |
`Gamma` | `identity`, `inverse`, `log` |
`inverse.gaussian` | $1/\mu^2$, `identity`, `inverse`, `log` |
`poisson` | `log`, `identity`, `sqrt` |
`quasi` | `logit`, `probit`, `cloglog`, `identity`, `inverse`, `log`, $1/\mu^2$, `sqrt` |
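
The link a family uses when none is given can be read off the family object itself. The following short sketch collects these defaults; the names `fams` and `default_links` are introduced here for illustration only.

```r
# Each family function returns an object that records its link
fams <- list(
  binomial         = binomial(),
  gaussian         = gaussian(),
  Gamma            = Gamma(),
  inverse.gaussian = inverse.gaussian(),
  poisson          = poisson()
)
default_links <- sapply(fams, function(f) f$link)
default_links

# A non-default link is requested through the link argument
Gamma(link = "log")$link
```

Passing a family object built this way to `glm()` via `family =` selects both the error distribution and the link in one step.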
Generalized Linear Model Example in R
Consider the “cars” dataset available in R.

    # Load and inspect the data
    data(cars)
    head(cars)
    scatter.smooth(x = cars$speed, y = cars$dist, main = "Dist ~ Speed")

    # Linear Model
    lin_mod <- lm(dist ~ speed, data = cars)
    summary(lin_mod)

    # Generalized Linear Model (Gaussian family, identity link).
    # The binomial family needs a response in [0, 1], which dist is not.
    glm_mod <- glm(dist ~ speed, data = cars, family = gaussian)
    plot(glm_mod)
    summary(glm_mod)
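
A Gaussian family with an identity link reduces to ordinary least squares, so `lm()` and `glm()` with `family = gaussian` should give the same coefficients on the `cars` data. The check below is a small sketch of that equivalence; the object names are chosen here for illustration.

```r
lin_mod <- lm(dist ~ speed, data = cars)
glm_mod <- glm(dist ~ speed, data = cars, family = gaussian)

# Coefficients from both fits should coincide up to numerical tolerance
all.equal(coef(lin_mod), coef(glm_mod))

# For a Gaussian GLM, the residual deviance is the residual sum of squares
all.equal(deviance(glm_mod), sum(residuals(lin_mod)^2))
```

The two summaries still differ in what they report: `summary()` on the `lm` fit shows $R^2$ and an F-statistic, while on the `glm` fit it shows the null and residual deviance and the AIC.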