
203.2.5 Multicollinearity and Individual Impact of Variables in Logistic Regression

Multicollinearity

In the previous section, we studied goodness of fit for logistic regression.

  • When the relation between X and Y is non-linear, we use logistic regression.
  • Multicollinearity is an issue related to the predictor variables, so it needs to be fixed in logistic regression as well.
  • Otherwise, the individual coefficients of the predictors will be affected by their interdependency.
  • The process of identification is the same as in linear regression: we look at the VIF values.

Multicollinearity in R

library(car)
## Warning: package 'car' was built under R version 3.1.3
vif(Fiberbits_model_1)  # variance inflation factor for each predictor
##                     income          months_on_network 
##                   4.590705                   4.641040 
##             Num_complaints        number_plan_changes 
##                   1.018607                   1.126892 
##                  relocated               monthly_bill 
##                   1.145847                   1.017565 
## technical_issues_per_month          Speed_test_result 
##                   1.020648                   1.206999
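All the VIF values above are below the usual rule-of-thumb cutoff of 5 (the largest is about 4.64 for months_on_network), so multicollinearity does not look like a serious problem in this model. As a quick one-line check, assuming Fiberbits_model_1 is the fitted glm object from the earlier sections, we can pull out the largest VIF directly:

max(vif(Fiberbits_model_1))  # largest VIF across all predictors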

Individual Impact of Variables

  • Out of these predictor variables, which are the important ones?
  • If we have to choose the top 5 variables, what are they?
  • While selecting the model, we may want to drop a few of the less impactful variables.
  • How do we rank the predictor variables in the order of their importance?
  • We can simply look at the z-value of each variable and consider its absolute value.
  • Or we can calculate the Wald chi-square, which is approximately the square of the z-score (a short sketch after the output below shows this calculation).
  • The Wald chi-square value helps in ranking the variables.

Code – Individual Impact of Variables

library(caret)
## Warning: package 'caret' was built under R version 3.1.3
## Loading required package: lattice
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 3.1.3
varImp(Fiberbits_model_1, scale = FALSE)
##                             Overall
## income                     20.81981
## months_on_network          28.65421
## Num_complaints             22.81102
## number_plan_changes        24.93955
## relocated                  79.92677
## monthly_bill               13.99490
## technical_issues_per_month 54.58123
## Speed_test_result          93.43471

For a glm model, this gives the absolute value of the z-score of each variable.
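If we want the Wald chi-square values mentioned above rather than the absolute z-scores, a minimal sketch (again assuming Fiberbits_model_1 is the fitted glm object) is to square the z-values from the model summary:

# Wald chi-square is approximately the square of the z-value reported by summary()
coef_table <- summary(Fiberbits_model_1)$coefficients
wald_chisq <- coef_table[, "z value"]^2
sort(wald_chisq, decreasing = TRUE)  # rank the predictors by importance

The ranking will match the varImp() output above, since squaring does not change the ordering of the absolute z-scores; the only extra entry is the intercept, which we ignore while ranking the predictors.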

Model Selection – AIC and BIC

  • AIC and BIC values play a role similar to adjusted R-squared in linear regression.
  • The AIC of a stand-alone model has no real use, but when we are choosing between models, AIC really helps.
  • Given a collection of models for the data, AIC estimates the quality of each model relative to the other models.
  • If we are choosing between two models, the model with the lower AIC is preferred.
  • AIC is an estimate of the information lost when a given model is used to represent the process that generates the data.
  • AIC = -2ln(L) + 2k (a small sketch after the code below recomputes this by hand)
  • L is the maximum value of the likelihood function for the model.
  • k is the number of parameters estimated by the model (including the intercept).
  • BIC is an alternative to AIC with a slightly different formula: the penalty term uses ln(n) instead of 2. We will follow either AIC or BIC throughout our analysis.

Code – AIC and BIC

library(stats)
AIC(Fiberbits_model_1)
## [1] 98377.36
BIC(Fiberbits_model_1)
## [1] 98462.97
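To connect these numbers with the formula above, here is a small sketch, again assuming Fiberbits_model_1 is the fitted glm object: logLik() returns ln(L), and its "df" attribute is the number of estimated parameters k.

ll <- logLik(Fiberbits_model_1)                                        # ln(L) for the fitted model
-2 * as.numeric(ll) + 2 * attr(ll, "df")                               # should reproduce AIC()
-2 * as.numeric(ll) + attr(ll, "df") * log(nobs(Fiberbits_model_1))    # BIC: penalty is k*ln(n)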

The next post is about Model Selection in logistic regression.
