Generalized linear models (GLMs) can be used when the distribution of the response variable is non-normal, or when the mean of the response is linear in the predictors only after a transformation. GLMs are flexible extensions of linear models for fitting regression models to non-Gaussian data.

The basic form of a generalized linear model is

\begin{align*}
g(\mu_i) &= X_i' \beta \\
&= \beta_0 + \sum\limits_{j=1}^p x_{ij} \beta_j
\end{align*}

where $\mu_i=E(Y_i)$ is the expected value of the response variable $Y_i$ given the predictors, $g(\cdot)$ is a smooth and monotonic link function that connects $\mu_i$ to the predictors, $X_i'=(x_{i0}, x_{i1}, \cdots, x_{ip})$ is the known vector of predictor values for the $i$th observation with $x_{i0}=1$, and $\beta=(\beta_0, \beta_1, \cdots, \beta_p)'$ is the unknown vector of regression coefficients.
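To make the notation concrete, the following minimal sketch evaluates the linear predictor $X_i'\beta$ for a few made-up observations and maps it back to the mean scale with the inverse of a log link. The design matrix `X`, the coefficient vector `beta`, and their values are illustrative assumptions, not estimates from any dataset; `make.link()` comes from base R's stats package.

```r
# Hypothetical design matrix: a column of ones (x_i0 = 1) plus one predictor
X <- cbind(1, c(0.5, 1.0, 1.5))

# Hypothetical coefficients (beta_0, beta_1), chosen only for illustration
beta <- c(0.2, 0.8)

# Linear predictor eta_i = X_i' beta
eta <- drop(X %*% beta)

# Log link: g(mu) = log(mu), so mu = g^{-1}(eta) = exp(eta)
log_link <- make.link("log")
mu <- log_link$linkinv(eta)

# Applying g to mu recovers the linear predictor
all.equal(log_link$linkfun(mu), eta)
```

The same pattern applies to any other link: only the pair `linkfun` / `linkinv` changes, while the linear predictor is computed in the same way.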

The `glm()` function can be used to fit a generalized linear model, using the form

```r
mod <- glm(formula, family = gaussian, data = data.frame)
```

The `family` argument is a description of the error distribution and link function to be used in the model.

A generalized linear model is specified by giving a symbolic description of the linear predictor (the `formula`) and a description of the error distribution (the `family`).
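As an illustration of how the formula and the family combine, the sketch below fits a logistic regression to the built-in `mtcars` data, modeling the binary transmission indicator `am` as a function of car weight `wt`. The choice of dataset and variables is an assumption made for this example only.

```r
# Formula = linear predictor; family = error distribution and link
logit_fit <- glm(am ~ wt, family = binomial(link = "logit"), data = mtcars)

# Coefficients are on the logit (log-odds) scale
coef(logit_fit)

# Fitted probabilities on the response scale, always between 0 and 1
head(predict(logit_fit, type = "response"))
```

Passing `type = "response"` applies the inverse link, so the predictions are probabilities rather than log-odds.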

Family Name | Link Functions |
---|---|
`binomial` | `logit`, `probit`, `cauchit`, `log`, `cloglog` |
`gaussian` | `identity`, `log`, `inverse` |
`Gamma` | `inverse`, `identity`, `log` |
`inverse.gaussian` | `1/mu^2`, `inverse`, `identity`, `log` |
`poisson` | `log`, `identity`, `sqrt` |
`quasi` | `logit`, `probit`, `cloglog`, `identity`, `inverse`, `log`, `1/mu^2`, `sqrt` |
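Each family is a function in R, and its `link` argument selects the link to use. A minimal sketch of inspecting family objects and requesting a non-default link follows; the object name `fam_probit` is illustrative.

```r
# Default link of a few families
poisson()$link           # "log"
Gamma()$link             # "inverse"
inverse.gaussian()$link  # "1/mu^2"

# Request a non-default link through the `link` argument
fam_probit <- binomial(link = "probit")
fam_probit$link          # "probit"

# A family object also carries the link and its inverse as functions
fam_probit$linkinv(0)    # inverse probit at 0, i.e. pnorm(0)
```

A family object built this way can be passed directly as the `family` argument of `glm()`.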

### Generalized Linear Model Example in R

Consider the `cars` dataset available in R, which records the speed of cars (`speed`) and their stopping distances (`dist`).

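```r
data(cars)
head(cars)

# Scatter plot of stopping distance against speed with a smooth curve
scatter.smooth(x = cars$speed, y = cars$dist, main = "Dist ~ Speed")

# Linear Model
lm(dist ~ speed, data = cars)
summary(lm(dist ~ speed, data = cars))

# Generalized Linear Model
# dist is a continuous response, so the gaussian family is used here;
# family = "binomial" would fail because dist is not in [0, 1]
glm(dist ~ speed, data = cars, family = "gaussian")
plot(glm(dist ~ speed, data = cars))
summary(glm(dist ~ speed, data = cars))
```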
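With the default `gaussian` family and identity link, `glm()` reproduces the ordinary least squares fit from `lm()`. The following sketch checks this on `cars` and, as a hypothetical alternative that is not part of the example above, fits a Gamma model with a log link, which may suit a positive, right-skewed response such as stopping distance. The object names are illustrative.

```r
lm_fit  <- lm(dist ~ speed, data = cars)
glm_fit <- glm(dist ~ speed, family = gaussian, data = cars)

# Both fits give the same coefficient estimates
all.equal(coef(lm_fit), coef(glm_fit))

# Assumed alternative: Gamma family with a log link
gamma_fit <- glm(dist ~ speed, family = Gamma(link = "log"), data = cars)

# Compare the two GLMs by AIC (lower values indicate a better trade-off)
AIC(glm_fit, gamma_fit)
```

Whether the Gamma model is preferable depends on the residual diagnostics and on the purpose of the analysis; the AIC comparison is only one possible criterion.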