Logistic Regression Models in R

This article covers the use and application of logistic regression models in the R language. In a logistic regression model, the response variable ($y$) is categorical (binary, dichotomous), taking values such as 1 or 0 (TRUE/FALSE). The model estimates the probability of the binary response from a mathematical equation relating the response variable to the predictor(s). The built-in glm() function in R can be used to fit logistic regression models.
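As a sketch of the mathematical equation mentioned above, the standard logistic regression model with $k$ predictors can be written as follows. The symbols $\beta_0, \ldots, \beta_k$ (coefficients) and $x_1, \ldots, x_k$ (predictors) are notation introduced here for illustration, and $p$ denotes the probability that $y = 1$:

$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k, \qquad p = \frac{e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}{1 + e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}$$

The left-hand side is the log of the odds (the logit), which is why the quantities discussed in the following sections are odds and log odds.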

Probability and Odds Ratio

Logistic regression works with odds. If $p$ is the probability of success and $q = 1 - p$ is the probability of failure, the odds in favour of success are $\frac{p}{q}=\frac{p}{1-p}$.

Note that a probability can be converted to odds, and odds can be converted back to a probability. However, unlike a probability, odds can exceed 1. For example, if the probability of an event is 0.25, the odds in favour of that event are $\frac{0.25}{0.75}\approx 0.33$, and the odds against the same event are $\frac{0.75}{0.25}=3$.
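The conversion in both directions can be checked with a few lines of R. This is a minimal sketch; the helper names prob_to_odds() and odds_to_prob() are hypothetical and are not part of base R.

```r
# Hypothetical helpers: convert a probability to odds, and odds back to a probability.
prob_to_odds <- function(p) p / (1 - p)
odds_to_prob <- function(odds) odds / (1 + odds)

p <- 0.25
odds_for     <- prob_to_odds(p)      # odds in favour: 0.25 / 0.75
odds_against <- prob_to_odds(1 - p)  # odds against:   0.75 / 0.25

odds_for
odds_against
odds_to_prob(odds_for)               # recovers the original probability
```

The last line returns the probability we started with, which shows that the two scales carry the same information.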

Logistic Regression Models in R (Example)

In the built-in dataset “mtcars”, the column “am” describes the transmission mode (automatic or manual), coded as a binary value (0 or 1). Let us fit logistic regression models with “am” as the response variable and “vs”, “wt”, “cyl”, and “hp” as regressors, as given below:

Logistic Regression with one Dichotomous Predictor

logmodel1 <- glm(am ~ vs, family = "binomial", data = mtcars)
summary(logmodel1)

Logistic Regression with one Continuous Predictor

If the predictor variable is continuous, the logistic regression model in R is specified as given below:

logmodel2 <- glm(am ~ wt, family = "binomial", data = mtcars)
summary(logmodel2)
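With a continuous predictor, it is often easier to read the fitted model through predicted probabilities than through the coefficient itself. The following is a minimal sketch using predict() with type = "response"; the data frame new_cars and its weights are hypothetical values chosen only for illustration.

```r
# Refit the one-predictor model so this block runs on its own.
logmodel2 <- glm(am ~ wt, family = "binomial", data = mtcars)

# Hypothetical weights (in 1000 lbs, the unit used by mtcars$wt).
new_cars <- data.frame(wt = c(2, 3, 4))

# type = "response" returns probabilities of am = 1 rather than logits.
pred_prob <- predict(logmodel2, newdata = new_cars, type = "response")
cbind(new_cars, pred_prob)
```

In mtcars, am = 1 denotes a manual transmission, so these are estimated probabilities that a car of the given weight has a manual transmission.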

Multiple Predictors in Logistic Regression

The following is an example of a logistic regression model with more than one predictor. Diagnostic plots are also drawn for this model.

logmodel3 <- glm(am ~ cyl + hp + wt, family = "binomial", data = mtcars)
summary(logmodel3)
plot(logmodel3)
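By default, plot() on a fitted glm object draws its diagnostic plots one after another. One possible way to view them together, sketched here as a layout choice rather than a required step, is to split the plotting device into a 2 x 2 grid with par(mfrow = ...):

```r
# Refit the model so this block runs on its own.
logmodel3 <- glm(am ~ cyl + hp + wt, family = "binomial", data = mtcars)

# Show the four default diagnostic plots in a 2 x 2 grid,
# then restore the previous graphics settings.
old_par <- par(mfrow = c(2, 2))
plot(logmodel3)
par(old_par)
```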

Note: in a logistic regression model, both dichotomous and continuous variables can be used as predictors.

[Figure: Logistic Regression Models in R and Diagnostic Plots]

In R, the coefficients returned by logistic regression are logits, that is, logs of odds. To convert a logit to odds (an odds ratio, in the case of a predictor's coefficient), exponentiate it; to convert a logit to a probability, use $\frac{e^\beta}{1+e^\beta}$. For example,

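# Fit the model and extract the coefficients (logits)
logmodel1 <- glm(am ~ vs, family = "binomial", data = mtcars)
logit_coef <- coef(logmodel1)
# Logits to odds / odds ratios
exp(logit_coef)
# Logits to probabilities
exp(logit_coef) / (1 + exp(logit_coef))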
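The manual logit-to-probability formula can be cross-checked against R's built-in plogis() function, which computes the inverse logit. The sketch below also illustrates a point worth keeping in mind: a fitted probability comes from the whole linear predictor (intercept plus slope times the predictor value), not from a slope coefficient on its own. The variable names b, p_vs0, and p_vs1 are introduced here for illustration.

```r
# Refit the model so this block runs on its own.
logmodel1 <- glm(am ~ vs, family = "binomial", data = mtcars)
b <- coef(logmodel1)

# plogis() is the inverse logit: exp(x) / (1 + exp(x)).
manual  <- exp(b) / (1 + exp(b))
builtin <- plogis(b)
all.equal(manual, builtin)

# Fitted probability of am = 1 at each level of vs,
# computed from the full linear predictor.
p_vs0 <- plogis(b[["(Intercept)"]])
p_vs1 <- plogis(b[["(Intercept)"]] + b[["vs"]])
c(vs0 = p_vs0, vs1 = p_vs1)
```

Because vs is the only predictor and is dichotomous, these two fitted probabilities should agree with the observed proportions of am = 1 within each vs group.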