Kendall Correlation — kendall_cor • kendallknight

kendall_cor() calculates the Kendall correlation coefficient between two numeric vectors or, when given a matrix, between its columns. It uses the algorithm described in Knight (1966), which obtains the number of concordant and discordant pairs by sorting instead of comparing every pair directly. Its computational complexity is \(O(n \log(n))\), compared with \(O(n^2)\) for the base R implementation in stats::cor(..., method = "kendall"). For small vectors (i.e., fewer than 100 observations), the time difference is negligible. For larger vectors, the difference can be substantial.
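To make the complexity difference concrete, the following is a minimal base R sketch, not the package's implementation. It assumes no ties and no missing values, and the helper names tau_naive(), count_inversions() and tau_mergesort() are hypothetical. The first function compares every pair, which is \(O(n^2)\); the second sorts by x and counts inversions in y with a merge sort, which is \(O(n \log(n))\) and is in the spirit of the sort-based approach in Knight (1966).

```r
# Naive O(n^2) version: compare every pair of observations.
# Assumption: no ties and no missing values in x or y.
tau_naive <- function(x, y) {
  n <- length(x)
  s <- 0
  for (i in seq_len(n - 1)) {
    j <- (i + 1):n
    # +1 for each concordant pair, -1 for each discordant pair
    s <- s + sum(sign(x[j] - x[i]) * sign(y[j] - y[i]))
  }
  s / (n * (n - 1) / 2)
}

# Count inversions (pairs i < j with v[i] > v[j]) using a merge sort.
count_inversions <- function(v) {
  n <- length(v)
  if (n < 2) return(list(sorted = v, inv = 0))
  mid <- n %/% 2
  left <- count_inversions(v[1:mid])
  right <- count_inversions(v[(mid + 1):n])
  a <- left$sorted
  b <- right$sorted
  out <- numeric(n)
  i <- 1
  j <- 1
  inv <- left$inv + right$inv
  for (k in seq_len(n)) {
    if (j > length(b) || (i <= length(a) && a[i] <= b[j])) {
      out[k] <- a[i]
      i <- i + 1
    } else {
      # b[j] moves ahead of every element still waiting in a
      inv <- inv + (length(a) - i + 1)
      out[k] <- b[j]
      j <- j + 1
    }
  }
  list(sorted = out, inv = inv)
}

# O(n log n) version: after sorting by x, each inversion in y
# is exactly one discordant pair.
tau_mergesort <- function(x, y) {
  n <- length(x)
  d <- count_inversions(y[order(x)])$inv
  1 - 4 * d / (n * (n - 1))
}

# Both sketches agree with base R on continuous (tie-free) data
set.seed(42)
x <- rnorm(500)
y <- x + rnorm(500)
ref <- cor(x, y, method = "kendall")
stopifnot(isTRUE(all.equal(tau_naive(x, y), ref)))
stopifnot(isTRUE(all.equal(tau_mergesort(x, y), ref)))
```

With ties present, the denominator needs a correction that this sketch leaves out, so it should not be used as a replacement for kendall_cor().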

By construction, the implementation drops missing values on a pairwise basis. This is the same as using stats::cor(..., use = "pairwise.complete.obs").
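As an illustration of what pairwise deletion means, the base R sketch below removes every position where either value is missing and checks that the result matches stats::cor() with use = "pairwise.complete.obs". The data are made up for the example.

```r
# Made-up data with one missing value in each vector
x <- c(1, 2, NA, 4, 5, 6)
y <- c(2, 1, 3, NA, 6, 5)

# Keep only the positions where both x and y are observed
ok <- complete.cases(x, y)
manual <- cor(x[ok], y[ok], method = "kendall")

# Base R performs the same deletion when asked to
builtin <- cor(x, y, method = "kendall", use = "pairwise.complete.obs")

stopifnot(isTRUE(all.equal(manual, builtin)))
manual
#> [1] 0.3333333
```

Given the behavior described above, kendall_cor(x, y) is expected to return the same value for these inputs without any extra argument.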

Usage

kendall_cor(x, y = NULL)

Arguments

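x: a numeric vector or matrix. The examples below also pass a data frame of numeric columns.
y: an optional numeric vector. It can be omitted (NULL, the default) when x is a matrix.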

Value

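A numeric value between -1 and 1 when x and y are vectors. When x is a matrix, a matrix of pairwise correlations between its columns, with each entry between -1 and 1.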

References

Knight, W. R. (1966). A Computer Method for Calculating Kendall's Tau with Ungrouped Data. Journal of the American Statistical Association, 61(314), 436–439.

Abrevaya, J. (1999). Computation of the Maximum Rank Correlation Estimator. Economics Letters, 62, 279–285.

Christensen, D. (2005). Fast Algorithms for the Calculation of Kendall's Tau. Computational Statistics, 20, 51–62.

Emara (2024). Khufu: Object-Oriented Programming using C++.

Examples

# input vectors -> scalar output
x <- c(1, 0, 2)
y <- c(5, 3, 4)
kendall_cor(x, y)
#> [1] 0.3333333

# input matrix -> matrix output
x <- mtcars[, c("mpg", "cyl")]
kendall_cor(x)
#>            [,1]       [,2]
#> [1,]  1.0000000 -0.7953134
#> [2,] -0.7953134  1.0000000
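
As a cross-check on the matrix example, the sketch below computes the same correlation with base R and, only if the package is installed, compares it with kendall_cor(). The wrapping in as.matrix() and unname() is an assumption made so that only the values are compared, not row or column names.

```r
# Base R reference for the two mtcars columns used above
cols <- mtcars[, c("mpg", "cyl")]
ref <- cor(cols, method = "kendall")
print(ref)

# Optional comparison, skipped when kendallknight is not installed
if (requireNamespace("kendallknight", quietly = TRUE)) {
  fast <- kendallknight::kendall_cor(cols)
  # Compare values only, ignoring row and column names
  print(all.equal(unname(ref), unname(as.matrix(fast))))
}
```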