This function computes the Variance Inflation Factor (VIF) and the Coefficient of Variation (CV) of each independent variable (excluding the intercept) in a multiple linear regression model.

Usage

cv_vif(x, tol = 1e-30)

Arguments

x

A numerical design matrix with the intercept in the first column and more than one additional regressor (at least three columns in total).

tol

A real number that indicates the tolerance beyond which the system is considered computationally singular when calculating the VIF. The default value is tol=1e-30.

Details

Multicollinearity can be essential or non-essential. Essential multicollinearity is an approximate linear relationship between two or more independent variables (not including the intercept), while non-essential multicollinearity is a linear relationship between the intercept and at least one independent variable. The distinction matters because the Variance Inflation Factor (VIF) detects only essential multicollinearity, while the Coefficient of Variation (CV) is useful for detecting only non-essential multicollinearity. Knowing the limitations of each measure helps to establish whether the degree of multicollinearity is troubling, which kind is present, and which variables cause it.
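To make the two measures concrete, the sketch below computes them by hand in base R on simulated data. It is not the package's implementation: the textbook definitions are assumed (CV as the standard deviation divided by the absolute mean, VIF as 1 / (1 - R^2) from regressing each variable on the others), the sample standard deviation is an assumption, and the names cv_manual, vif_manual and vif_from_cor are hypothetical.

```r
# Hand-rolled sketch in base R; not the code used by cv_vif().
set.seed(1)
n  <- 200
x2 <- rnorm(n, mean = 5, sd = 0.01)    # almost constant, so close to a multiple of the intercept
x3 <- rnorm(n, mean = 5, sd = 10)
x4 <- x3 + rnorm(n, mean = 5, sd = 1)  # nearly a linear function of x3
X  <- cbind(cte = 1, x2, x3, x4)

Z <- X[, -1, drop = FALSE]             # regressors without the intercept

# CV: standard deviation relative to the absolute mean
# (sample standard deviation assumed here)
cv_manual <- apply(Z, 2, sd) / abs(colMeans(Z))

# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j
# on the remaining regressors plus an intercept
vif_manual <- sapply(seq_len(ncol(Z)), function(j) {
  r2 <- summary(lm(Z[, j] ~ Z[, -j]))$r.squared
  1 / (1 - r2)
})
names(vif_manual) <- colnames(Z)

# Equivalent shortcut: the diagonal of the inverse correlation matrix
vif_from_cor <- diag(solve(cor(Z)))

print(round(cbind(CV = cv_manual, VIF = vif_manual), 4))
```

In this simulation x2 has a very small CV but a VIF near 1, so only the CV points to its relationship with the intercept, whereas x3 and x4 have large VIFs and unremarkable CVs.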

Value

CV

Coefficient of Variation of each independent variable.

VIF

Variance Inflation Factor of each independent variable.
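One way to read the two columns together is to flag each variable against a cut-off per measure. The sketch below is illustrative only: flag_collinearity is a hypothetical helper, not part of the package, and both cut-offs (VIF above 10, CV below 0.1) are assumptions that should be replaced by the thresholds given in the references. The input values are copied from the Example 2 output on this page, and the result is assumed to be indexable by the column names CV and VIF.

```r
# Values copied from the Example 2 output on this page
res <- matrix(
  c( 0.002030169,   1.025399,
     1.886093419, 100.123352,
     0.961634537, 100.320601,
    10.725968638,   1.025810),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste("Variable", 2:5), c("CV", "VIF"))
)

# Hypothetical helper; vif_max and cv_min are illustrative assumptions
flag_collinearity <- function(res, vif_max = 10, cv_min = 0.1) {
  data.frame(
    variable      = rownames(res),
    essential     = unname(res[, "VIF"] > vif_max),  # large VIF
    non_essential = unname(res[, "CV"] < cv_min),    # small CV
    row.names     = NULL
  )
}

flags <- flag_collinearity(res)
print(flags)
```

With these assumed cut-offs, Variable 2 is flagged only by the CV and Variables 3 and 4 only by the VIF, which matches how the data in Example 2 are generated.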

References

Salmerón, R., García, C.B. and García, J. (2018). Variance inflation factor and condition number in multiple linear regression. Journal of Statistical Computation and Simulation, 88:2365-2384, doi: https://doi.org/10.1080/00949655.2018.1463376.

Salmerón, R., Rodríguez, A. and García, C.B. (2020). Diagnosis and quantification of the non-essential collinearity. Computational Statistics, 35(2), 647-666, doi: https://doi.org/10.1007/s00180-019-00922-x.

Salmerón, R., García, C.B., Rodríguez, A. and García, C. (2022). Limitations in detecting multicollinearity due to scaling issues in the mcvis package. R Journal, 14(4), 264-279, doi: https://doi.org/10.32614/RJ-2023-010.

Author

R. Salmerón (romansg@ugr.es) and C. García (cbgarcia@ugr.es).

Examples

### Example 1 
### At least three independent variables, including the intercept, must be present

  head(SLM1, n=5)
#>           y1 cte         V
#> 1  82.392059   1 19.001420
#> 2  -1.942157   1 -1.733458
#> 3   7.474090   1  1.025146
#> 4 -12.303381   1 -4.445014
#> 5  30.378203   1  6.689864
  y = SLM1[,1]
  x = SLM1[,2:3]
  cv_vif(x)
#> At least 3 independent variables are needed (including the interceptin the first column) to carry out the calculations.

### Example 2
### Creating the design matrix

  library(multiColl)
  set.seed(2025)
  obs = 100
  cte = rep(1, obs)
  x2 = rnorm(obs, 5, 0.01)
  x3 = rnorm(obs, 5, 10)
  x4 = x3 + rnorm(obs, 5, 1)
  x5 = rnorm(obs, -1, 30)
  x = cbind(cte, x2, x3, x4, x5)
  cv_vif(x)
#>                      CV        VIF
#> Variable 2  0.002030169   1.025399
#> Variable 3  1.886093419 100.123352
#> Variable 4  0.961634537 100.320601
#> Variable 5 10.725968638   1.025810

### Example 3 
### Obtaining the design matrix after executing the command 'lm'

  library(multiColl)
  set.seed(2025)
  obs = 100
  cte = rep(1, obs)
  x2 = rnorm(obs, 5, 0.01)
  x3 = rnorm(obs, 5, 10)
  x4 = x3 + rnorm(obs, 5, 1)
  x5 = rnorm(obs, -1, 30)
  u = rnorm(obs, 0, 2)
  y = 5 + 4*x2 - 5*x3 + 2*x4 - x5 + u
  reg = lm(y~x2+x3+x4+x5)
  x = model.matrix(reg)
  cv_vif(x) # identical to Example 2
#>                      CV        VIF
#> Variable 2  0.002030169   1.025399
#> Variable 3  1.886093419 100.123352
#> Variable 4  0.961634537 100.320601
#> Variable 5 10.725968638   1.025810

### Example 4
### Computationally singular system

  head(soil, n=5)
#>   BaseSat SumCation CECbuffer     Ca     Mg      K    Na     P    Cu    Zn
#> 1    2.34    0.1576     0.614 0.0892 0.0328 0.0256 0.010 0.000 0.080 0.184
#> 2    1.64    0.0970     0.516 0.0454 0.0218 0.0198 0.010 0.000 0.064 0.112
#> 3    5.20    0.4520     0.828 0.3306 0.0758 0.0336 0.012 0.240 0.136 0.350
#> 4    4.10    0.3054     0.698 0.2118 0.0536 0.0260 0.014 0.030 0.126 0.364
#> 5    2.70    0.2476     0.858 0.1568 0.0444 0.0304 0.016 0.384 0.078 0.376
#>      Mn HumicMatter Density    pH ExchAc Diversity
#> 1 3.200      0.1220  0.0822 0.516  0.466 0.2765957
#> 2 2.734      0.0952  0.0850 0.512  0.430 0.2613982
#> 3 4.148      0.1822  0.0746 0.554  0.388 0.2553191
#> 4 3.728      0.1646  0.0756 0.546  0.408 0.2401216
#> 5 4.756      0.2472  0.0692 0.450  0.624 0.1884498
  y = soil[,16]
  x = soil[,-16]
  cv_vif(x)
#> System is computationally singular. Modify the design matrix before running the code.