Performs a Chi-squared test for multivariate outliers.

moutlier_chisq(
  xs,
  mask = !Reduce("|", lapply(xs, is.na)),
  threshold = c(0.9, 0.95),
  return.score = FALSE
)

Arguments

xs

A data frame or list of vectors (which will be coerced to a numeric matrix).

mask

A logical vector that defines which values in xs will be used when computing statistics. Useful when a subset of quality-assured data is available. The default mask selects the observations where no variable is NA.
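The default mask expression from the usage above can be evaluated on its own in base R. As a small illustration (the vectors a and b are made-up inputs), it marks an observation as usable only when none of the variables is NA at that position:

```r
# Hypothetical inputs with missing values in different positions
xs <- list(a = c(1, NA, 3, 4), b = c(10, 20, NA, 40))

# is.na() per variable, OR-ed together across variables, then negated
default_mask <- !Reduce("|", lapply(xs, is.na))
default_mask
#> [1]  TRUE FALSE FALSE  TRUE
```

A custom mask would be a logical vector of the same length, for example one that is TRUE only for quality-assured observations.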

threshold

A length-two vector identifying thresholds for "mild" and "extreme" outliers.

return.score

If TRUE, return the numeric outlier score. If FALSE, return an ordered factor classifying the observations as one of "not outlier" (1), "mild outlier" (2), or "extreme outlier" (3). The factor levels printed in the examples also include "not evaluated".
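One common way to build a chi-squared test for multivariate outliers is to compare squared Mahalanobis distances against chi-squared quantiles, with degrees of freedom equal to the number of variables. The sketch below illustrates how the arguments could fit together under that assumption. It is not taken from the implementation of moutlier_chisq, the helper name chisq_outlier_sketch is hypothetical, and it omits the "not evaluated" level.

```r
# Sketch only: ASSUMES the score is a squared Mahalanobis distance and that
# `threshold` gives chi-squared probabilities. Not the package's actual code.
chisq_outlier_sketch <- function(xs,
                                 mask = !Reduce("|", lapply(xs, is.na)),
                                 threshold = c(0.9, 0.95),
                                 return.score = FALSE) {
  # Coerce the input to a numeric matrix, one column per variable
  m <- do.call(cbind, lapply(xs, as.numeric))
  # Location and scatter are estimated from the masked rows only
  ref <- m[mask, , drop = FALSE]
  score <- stats::mahalanobis(m, colMeans(ref), stats::cov(ref))
  if (return.score) return(score)
  # Convert the probabilities into cutoffs on the score scale
  cutoffs <- stats::qchisq(threshold, df = ncol(m))
  cut(score, breaks = c(-Inf, cutoffs, Inf),
      labels = c("not outlier", "mild outlier", "extreme outlier"),
      ordered_result = TRUE)
}

# Usage: plant one obvious outlier in otherwise standard-normal data
set.seed(1)
x <- rnorm(50)
y <- rnorm(50)
x[1] <- 10
y[1] <- -10
res <- chisq_outlier_sketch(list(x, y))
scores <- chisq_outlier_sketch(list(x, y), return.score = TRUE)
```

Under this assumption, the default thresholds with two variables correspond to qchisq(c(0.9, 0.95), df = 2), roughly 4.61 and 5.99 on the score scale.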

Examples

x = seq(0, 34, by = 0.25)*pi
noise = rnorm(length(x), mean = 1, sd = 3)
y = sin(x) + noise
mask = noise < 1
moutlier_chisq(list(x, y))
#> [1] not outlier not outlier not outlier not outlier
#> [5] not outlier not outlier not outlier not outlier
#> [9] not outlier not outlier not outlier not outlier
#> [13] not outlier not outlier not outlier not outlier
#> [17] not outlier not outlier not outlier not outlier
#> [21] extreme outlier not outlier not outlier not outlier
#> [25] not outlier not outlier not outlier not outlier
#> [29] not outlier not outlier not outlier not outlier
#> [33] not outlier not outlier not outlier not outlier
#> [37] extreme outlier not outlier not outlier not outlier
#> [41] not outlier not outlier not outlier not outlier
#> [45] not outlier not outlier not outlier not outlier
#> [49] not outlier mild outlier not outlier not outlier
#> [53] not outlier not outlier not outlier not outlier
#> [57] not outlier not outlier not outlier not outlier
#> [61] not outlier not outlier not outlier not outlier
#> [65] not outlier not outlier not outlier not outlier
#> [69] not outlier not outlier not outlier not outlier
#> [73] not outlier mild outlier not outlier not outlier
#> [77] not outlier not outlier not outlier not outlier
#> [81] not outlier not outlier not outlier not outlier
#> [85] not outlier not outlier not outlier not outlier
#> [89] not outlier not outlier not outlier mild outlier
#> [93] not outlier not outlier not outlier not outlier
#> [97] not outlier not outlier not outlier not outlier
#> [101] not outlier not outlier not outlier not outlier
#> [105] not outlier not outlier not outlier not outlier
#> [109] not outlier not outlier not outlier not outlier
#> [113] extreme outlier not outlier not outlier not outlier
#> [117] not outlier not outlier not outlier not outlier
#> [121] extreme outlier not outlier not outlier not outlier
#> [125] not outlier not outlier not outlier not outlier
#> [129] not outlier mild outlier not outlier mild outlier
#> [133] not outlier not outlier not outlier not outlier
#> [137] not outlier
#> Levels: not outlier < not evaluated < mild outlier < extreme outlier
moutlier_chisq(list(x, y), mask)
#> [1] not outlier not outlier not outlier not outlier
#> [5] not outlier not outlier not outlier not outlier
#> [9] not outlier not outlier not outlier not outlier
#> [13] not outlier not outlier not outlier not outlier
#> [17] not outlier not outlier not outlier not outlier
#> [21] extreme outlier not outlier not outlier not outlier
#> [25] not outlier not outlier not outlier not outlier
#> [29] not outlier not outlier not outlier not outlier
#> [33] not outlier not outlier not outlier not outlier
#> [37] extreme outlier not outlier not outlier not outlier
#> [41] not outlier not outlier not outlier not outlier
#> [45] not outlier not outlier not outlier not outlier
#> [49] not outlier mild outlier not outlier not outlier
#> [53] not outlier not outlier not outlier not outlier
#> [57] not outlier not outlier not outlier not outlier
#> [61] not outlier not outlier not outlier not outlier
#> [65] not outlier not outlier not outlier not outlier
#> [69] not outlier not outlier not outlier not outlier
#> [73] not outlier mild outlier not outlier not outlier
#> [77] not outlier not outlier not outlier not outlier
#> [81] not outlier not outlier not outlier not outlier
#> [85] not outlier not outlier not outlier not outlier
#> [89] not outlier not outlier not outlier mild outlier
#> [93] not outlier not outlier not outlier not outlier
#> [97] not outlier not outlier not outlier not outlier
#> [101] not outlier not outlier not outlier not outlier
#> [105] not outlier not outlier not outlier not outlier
#> [109] not outlier not outlier not outlier not outlier
#> [113] extreme outlier not outlier not outlier not outlier
#> [117] not outlier not outlier not outlier not outlier
#> [121] extreme outlier not outlier not outlier not outlier
#> [125] not outlier not outlier not outlier not outlier
#> [129] not outlier mild outlier not outlier mild outlier
#> [133] not outlier not outlier not outlier not outlier
#> [137] not outlier
#> Levels: not outlier < not evaluated < mild outlier < extreme outlier
moutlier_chisq(list(x, y), mask, threshold = c(0.8, 0.9))
#> [1] not outlier not outlier mild outlier not outlier
#> [5] not outlier mild outlier not outlier not outlier
#> [9] mild outlier not outlier not outlier not outlier
#> [13] not outlier mild outlier not outlier not outlier
#> [17] not outlier not outlier mild outlier not outlier
#> [21] extreme outlier not outlier not outlier not outlier
#> [25] not outlier not outlier not outlier not outlier
#> [29] not outlier not outlier not outlier not outlier
#> [33] not outlier not outlier not outlier not outlier
#> [37] extreme outlier not outlier not outlier not outlier
#> [41] not outlier not outlier not outlier not outlier
#> [45] not outlier not outlier not outlier not outlier
#> [49] not outlier extreme outlier not outlier not outlier
#> [53] not outlier not outlier not outlier not outlier
#> [57] not outlier not outlier not outlier not outlier
#> [61] not outlier not outlier not outlier not outlier
#> [65] not outlier not outlier not outlier mild outlier
#> [69] not outlier not outlier not outlier mild outlier
#> [73] not outlier extreme outlier not outlier not outlier
#> [77] not outlier not outlier not outlier not outlier
#> [81] not outlier not outlier not outlier not outlier
#> [85] not outlier not outlier not outlier not outlier
#> [89] not outlier not outlier not outlier extreme outlier
#> [93] not outlier not outlier not outlier not outlier
#> [97] not outlier not outlier not outlier mild outlier
#> [101] not outlier not outlier not outlier not outlier
#> [105] not outlier not outlier not outlier not outlier
#> [109] mild outlier not outlier not outlier not outlier
#> [113] extreme outlier not outlier not outlier not outlier
#> [117] not outlier not outlier not outlier mild outlier
#> [121] extreme outlier not outlier not outlier not outlier
#> [125] not outlier mild outlier not outlier not outlier
#> [129] mild outlier extreme outlier not outlier extreme outlier
#> [133] mild outlier mild outlier mild outlier not outlier
#> [137] mild outlier
#> Levels: not outlier < not evaluated < mild outlier < extreme outlier
moutlier_chisq(list(x, y), return.score = TRUE)
#> [1] 3.02694763 2.91034863 4.02214462 2.77346595 2.68979604 4.37342642
#> [7] 2.43989889 2.72083191 4.13795807 2.45795094 3.12030856 2.33283268
#> [13] 2.02635122 4.06770990 1.99873015 1.98703034 2.22154927 3.19092492
#> [19] 3.82609483 2.54844962 8.70872590 1.55437810 1.91582939 1.46865677
#> [25] 2.02526481 2.53966886 1.30893085 2.09981002 2.13733894 1.87735085
#> [31] 1.17118268 0.97379820 1.16596209 1.45977594 0.82192557 0.80130824
#> [37] 8.51565378 1.87015835 1.75486445 1.28624630 0.95869354 0.51451976
#> [43] 0.86486315 0.97861012 0.98323886 0.52751322 1.12698747 0.28185258
#> [49] 1.70038687 4.79594641 0.30660685 3.03924590 0.48254219 0.29221048
#> [55] 1.03837716 0.48343720 0.56893269 0.97831878 0.14479999 0.27348757
#> [61] 0.81379146 2.79682815 0.03389963 0.02908691 0.53193348 0.19998991
#> [67] 0.01195462 3.99832444 0.30286383 0.21118540 0.15016959 3.47495234
#> [73] 2.43615211 5.43140381 0.37590634 0.16220959 0.39064781 0.23218033
#> [79] 0.35089085 1.00780412 1.64352606 0.32023773 0.15513807 1.57243951
#> [85] 0.35592936 3.16132897 1.42604593 0.25048538 0.36640665 0.30273626
#> [91] 0.45656538 4.91280485 0.46692312 0.46830655 2.03441275 0.94158895
#> [97] 2.29051436 0.60108145 0.57174464 4.37354567 0.67783555 0.70693179
#> [103] 1.27863043 1.44978886 0.82658048 0.89250961 1.41367746 2.34808558
#> [109] 4.10638702 1.10078042 1.35455377 1.31191974 6.02502419 1.29208294
#> [115] 1.69406160 1.67882090 1.92493265 2.23589086 1.59706791 3.26923085
#> [121] 6.90368471 2.90880006 1.86284299 2.17382604 1.99869738 3.82988329
#> [127] 3.11620850 2.63925687 3.76616439 5.07189240 3.17716610 5.01900923
#> [133] 3.47585801 3.45902512 4.22486617 2.85088780 4.05305047