# Poisson Regression in R

The Poisson regression model is appropriate when the dependent (response) variable is a count, that is, when its values can reasonably be assumed to follow a Poisson distribution. In R, the glm() function can be used to fit a Poisson regression model.

Note that the lm() function is used to fit simple and multiple linear regression models when the dependent variable is continuous. Statistical models such as linear or Poisson regression models can be fitted easily in R.

The Poisson regression is used to analyze count data.

For the Poisson model, let us consider another built-in data set warpbreaks. This data set describes the effect of wool type (A or B) and tension (Low, Medium, and High) on the number of warp breaks per loom, where a loom corresponds to a fixed length of yarn.
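Before fitting a model, it can help to inspect the data. The following is a minimal sketch using only base R; the object names `cell_means` and `cell_vars` are illustrative and do not come from the original text.

```r
# Structure of the built-in data set: 54 observations on 3 variables
str(warpbreaks)

# Number of looms observed for each wool/tension combination
print(table(warpbreaks$wool, warpbreaks$tension))

# Mean and variance of the break counts within each combination.
# A Poisson distribution has equal mean and variance, so comparing
# the two gives a rough first impression of how well it may fit.
groups <- list(warpbreaks$wool, warpbreaks$tension)
cell_means <- tapply(warpbreaks$breaks, groups, mean)
cell_vars  <- tapply(warpbreaks$breaks, groups, var)

print(round(cell_means, 2))
print(round(cell_vars, 2))
```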

The $breaks$ variable is taken as the response variable since it contains the number of breaks (a count). The $wool$ and $tension$ variables are taken as predictor variables.

pois_mod <- glm(breaks ~ wool + tension, data = warpbreaks, family = poisson)
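Written out, and assuming R's default treatment coding with wool A and low tension as the reference levels, the formula above corresponds to a log-linear model for the expected number of breaks $\mu_i$ on loom $i$:

$$
\text{breaks}_i \sim \text{Poisson}(\mu_i), \qquad \log(\mu_i) = \beta_0 + \beta_1\,\text{woolB}_i + \beta_2\,\text{tensionM}_i + \beta_3\,\text{tensionH}_i
$$

Here $\text{woolB}_i$, $\text{tensionM}_i$, and $\text{tensionH}_i$ are 0/1 indicator variables, and the log appears because it is the default link for the poisson family.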

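Printing the pois_mod object displays the model call, the estimated coefficients, the degrees of freedom, and the null and residual deviances.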

The glm() function provides eight choices for the family argument, each with its own default link function. For the poisson family used here, the default link is the log.
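The families and their default links can be listed directly in R. The sketch below assumes the family constructors from base R's stats package; the names `families` and `default_links` are illustrative.

```r
# The eight family constructors available in base R's stats package,
# each called with its default arguments
families <- list(
  binomial         = binomial(),
  gaussian         = gaussian(),
  Gamma            = Gamma(),
  inverse.gaussian = inverse.gaussian(),
  poisson          = poisson(),
  quasi            = quasi(),
  quasibinomial    = quasibinomial(),
  quasipoisson     = quasipoisson()
)

# Extract the name of the default link stored in each family object
default_links <- sapply(families, function(f) f$link)
print(default_links)
```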

The detailed output (estimation and testing of parameters) can be obtained with

summary(pois_mod)
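Because the model uses a log link, the coefficients reported by summary() are on the log scale. A common way to read them, sketched here under the assumption that Wald-type intervals are acceptable, is to exponentiate them into rate ratios. The names `rate_ratios` and `wald_ci` are illustrative.

```r
# Refit the model from the text so this block is self-contained
pois_mod <- glm(breaks ~ wool + tension, data = warpbreaks, family = poisson)

# Coefficients are on the log scale; exponentiating gives rate ratios
rate_ratios <- exp(coef(pois_mod))

# Wald-type 95% confidence intervals, also moved to the rate-ratio scale
wald_ci <- exp(confint.default(pois_mod))

print(round(cbind(rate_ratio = rate_ratios, wald_ci), 3))
```

A rate ratio below 1 for a factor level suggests fewer expected breaks than at the reference level (wool A or low tension), with the other predictor held fixed.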

Examples of count data that may be analyzed with Poisson regression:

• Number of cargo ships damaged by waves (McCullagh & Nelder, 1989).
• Number of deaths due to AIDS in Australia per quarter (3-month periods) from January 1983 to June 1986.
• Number of violent incidents exhibited over a 6-month period by patients who had been treated in the ER of a psychiatric hospital (Gardner, Mulvey, & Shaw, 1995).
• Daily homicide counts in California (Grogger, 1990).
• Founding of daycare centers in Toronto (Baum & Oliver, 1992).
• Political party-switching among members of the US House of Representatives (King, 1988).
• Number of presidential appointments to the Supreme Court (King, 1987).
• Number of children in a classroom that a child lists as being their friend (unlimited nomination procedure, sociometric data).
• Number of hard disk failures during a year.
• Number of deaths due to SARS (Yu, Chan & Fung, 2006).
• Number of arrests resulting from 911 calls.
• Number of orders of protection issued.
