VIF and CV calculation
This function provides the Variance Inflation Factor (VIF) and the Coefficient of Variation (CV) of each independent variable (excluding the intercept) in a multiple linear regression model.
Details
Essential and non-essential multicollinearity should be distinguished. Essential multicollinearity is an approximate linear relationship between two or more independent variables (not including the intercept), while non-essential multicollinearity is an approximate linear relationship between the intercept and at least one independent variable. The distinction matters because the Variance Inflation Factor (VIF) detects only essential multicollinearity, while the Coefficient of Variation (CV) is useful for detecting only non-essential multicollinearity. Knowing what each measure can and cannot detect helps to establish whether the degree of multicollinearity is troubling, which kind is present, and which variables cause it.
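To make the two measures concrete, here is a minimal base R sketch that computes them by hand. It is not the package's implementation: it assumes the CV is the sample standard deviation divided by the absolute mean (the package may use a different denominator) and the VIF is 1/(1 - R^2) from the auxiliary regression of each variable on the remaining ones plus an intercept. The simulated data and the names `cv_manual` and `vif_manual` are illustrative only.

```r
set.seed(1)
n  <- 100
x2 <- rnorm(n, 5, 0.01)      # almost constant: nearly collinear with the intercept
x3 <- rnorm(n, 5, 10)
x4 <- x3 + rnorm(n, 5, 1)    # almost a linear function of x3
X  <- cbind(x2, x3, x4)      # independent variables, intercept excluded

# Coefficient of variation (assumed definition: sd / |mean|)
cv_manual <- apply(X, 2, function(v) sd(v) / abs(mean(v)))

# VIF from auxiliary regressions (lm adds the intercept itself)
vif_manual <- sapply(seq_len(ncol(X)), function(j) {
  r2 <- summary(lm(X[, j] ~ X[, -j]))$r.squared
  1 / (1 - r2)
})
names(vif_manual) <- colnames(X)

print(cbind(CV = cv_manual, VIF = vif_manual))
```

In this sketch `x2` should show a very small CV but an unremarkable VIF, while `x3` and `x4` should show large VIFs but ordinary CVs, which is the pattern the paragraph above describes.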
Value
- CV
Coefficient of Variation of each independent variable.
- VIF
Variance Inflation Factor of each independent variable.
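One possible way to read the returned two-column result is sketched below, using the values printed in Example 2 of this page. The cut-offs are assumptions for illustration, not part of this function: 10 is a common rule of thumb for the VIF, and 0.1002506 is the CV threshold as understood from Salmerón et al. (2020); check both against the references before relying on them. The names `res`, `vif_cut`, `cv_cut` and `flags` are hypothetical.

```r
# Values copied from the output of Example 2 on this page
res <- matrix(
  c(0.002030169,   1.025399,
    1.886093419, 100.123352,
    0.961634537, 100.320601,
    10.725968638,  1.025810),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste("Variable", 2:5), c("CV", "VIF"))
)

vif_cut <- 10         # assumed rule-of-thumb threshold for the VIF
cv_cut  <- 0.1002506  # assumed threshold for the CV (see lead-in)

flags <- data.frame(
  non_essential = res[, "CV"]  < cv_cut,   # low CV: close to the intercept
  essential     = res[, "VIF"] > vif_cut,  # high VIF: close to other variables
  row.names     = rownames(res)
)
print(flags)
```

Under these assumed cut-offs, each variable is flagged for at most one kind of multicollinearity, so the two columns are best read together rather than separately.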
References
Salmerón, R., García, C.B. and García, J. (2018). Variance inflation factor and condition number in multiple linear regression. Journal of Statistical Computation and Simulation, 88(12), 2365-2384, doi: https://doi.org/10.1080/00949655.2018.1463376.
Salmerón, R., Rodríguez, A. and García, C.B. (2020). Diagnosis and quantification of the non-essential collinearity. Computational Statistics, 35(2), 647-666, doi: https://doi.org/10.1007/s00180-019-00922-x.
Salmerón, R., García, C.B., Rodríguez, A. and García, C. (2022). Limitations in detecting multicollinearity due to scaling issues in the mcvis package. R Journal, 14(4), 264-279, doi: https://doi.org/10.32614/RJ-2023-010.
Author
R. Salmerón (romansg@ugr.es) and C. García (cbgarcia@ugr.es).
Examples
### Example 1
### At least three independent variables, including the intercept, must be present
head(SLM1, n=5)
#> y1 cte V
#> 1 82.392059 1 19.001420
#> 2 -1.942157 1 -1.733458
#> 3 7.474090 1 1.025146
#> 4 -12.303381 1 -4.445014
#> 5 30.378203 1 6.689864
y = SLM1[,1]
x = SLM1[,2:3]
cv_vif(x)
#> At least 3 independent variables are needed (including the intercept in the first column) to carry out the calculations.
### Example 2
### Creating the design matrix
library(multiColl)
set.seed(2025)
obs = 100
cte = rep(1, obs)
x2 = rnorm(obs, 5, 0.01)
x3 = rnorm(obs, 5, 10)
x4 = x3 + rnorm(obs, 5, 1)
x5 = rnorm(obs, -1, 30)
x = cbind(cte, x2, x3, x4, x5)
cv_vif(x)
#> CV VIF
#> Variable 2 0.002030169 1.025399
#> Variable 3 1.886093419 100.123352
#> Variable 4 0.961634537 100.320601
#> Variable 5 10.725968638 1.025810
### Example 3
### Obtaining the design matrix after executing the command 'lm'
library(multiColl)
set.seed(2025)
obs = 100
cte = rep(1, obs)
x2 = rnorm(obs, 5, 0.01)
x3 = rnorm(obs, 5, 10)
x4 = x3 + rnorm(obs, 5, 1)
x5 = rnorm(obs, -1, 30)
u = rnorm(obs, 0, 2)
y = 5 + 4*x2 - 5*x3 + 2*x4 - x5 + u
reg = lm(y~x2+x3+x4+x5)
x = model.matrix(reg)
cv_vif(x) # identical to Example 2
#> CV VIF
#> Variable 2 0.002030169 1.025399
#> Variable 3 1.886093419 100.123352
#> Variable 4 0.961634537 100.320601
#> Variable 5 10.725968638 1.025810
### Example 4
### Computationally singular system
head(soil, n=5)
#> BaseSat SumCation CECbuffer Ca Mg K Na P Cu Zn
#> 1 2.34 0.1576 0.614 0.0892 0.0328 0.0256 0.010 0.000 0.080 0.184
#> 2 1.64 0.0970 0.516 0.0454 0.0218 0.0198 0.010 0.000 0.064 0.112
#> 3 5.20 0.4520 0.828 0.3306 0.0758 0.0336 0.012 0.240 0.136 0.350
#> 4 4.10 0.3054 0.698 0.2118 0.0536 0.0260 0.014 0.030 0.126 0.364
#> 5 2.70 0.2476 0.858 0.1568 0.0444 0.0304 0.016 0.384 0.078 0.376
#> Mn HumicMatter Density pH ExchAc Diversity
#> 1 3.200 0.1220 0.0822 0.516 0.466 0.2765957
#> 2 2.734 0.0952 0.0850 0.512 0.430 0.2613982
#> 3 4.148 0.1822 0.0746 0.554 0.388 0.2553191
#> 4 3.728 0.1646 0.0756 0.546 0.408 0.2401216
#> 5 4.756 0.2472 0.0692 0.450 0.624 0.1884498
y = soil[,16]
x = soil[,-16]
cv_vif(x)
#> System is computationally singular. Modify the design matrix before running the code.
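Before modifying a design matrix that triggers the message above, it can help to confirm that its columns are linearly dependent. The following base R sketch does this with the rank reported by `qr()` on a small simulated matrix. It is not how `cv_vif` detects singularity, which is not documented here; the matrix `X` and the names `rank_X` and `is_singular` are hypothetical.

```r
set.seed(1)
n <- 20
a <- rnorm(n)
b <- rnorm(n)

# Hypothetical design matrix: the last column is an exact sum of two others
X <- cbind(cte = 1, a = a, b = b, s = a + b)

rank_X      <- qr(X)$rank          # numerical rank from the QR decomposition
is_singular <- rank_X < ncol(X)    # TRUE when some column is redundant

if (is_singular) {
  message("Rank ", rank_X, " < ", ncol(X),
          " columns: drop or combine redundant variables first.")
}
```

A rank below the number of columns means at least one variable is an exact (or numerically exact) linear combination of the others, so removing such a variable is one way to modify the design matrix.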