src.dackar.anomalies.kernel_two_sample_test

Created on July, 2025

@author: wangc, mandd

Attributes

logger

Functions

`MMD2u`(K, m, n)	Method designed to compute the unbiased MMD^2_u statistic using U-statistics.
`MMD2b`(K, m, n)	Method designed to compute the biased MMD^2 statistic using V-statistics.
`MMD2u_UCB`(K, m[, alpha])	Method designed to calculate the uniform convergence bound for MMD2u
`MMD2b_UCB`(K, m[, alpha])	Method designed to calculate the uniform convergence bound for MMD2b
`compute_null_distribution`(K, m, n[, iterations, ...])	Method designed to calculate the bootstrap null-distribution of MMD2u.
`compute_null_distribution_given_permutations`(K, m, n, ...)	Method designed to calculate the bootstrap null-distribution of MMD2u given predefined permutations.
`kernel_two_sample_test`(X, Y[, kernel_function, ...])	Method designed to calculate MMD^2_u, its null distribution and the p-value of the kernel two-sample test.
`chebyshevTesting`(X, Y[, kernel_function, iterations, ...])	Method designed to perform Chebyshev testing using Chebyshev's inequality
`chebyshevTesting_precomputed_mmd`(mmd2u, mmd2u_null[, ...])	Method designed to perform MMD Chebyshev testing

Module Contents

src.dackar.anomalies.kernel_two_sample_test.logger

src.dackar.anomalies.kernel_two_sample_test.MMD2u(K, m, n)

Method designed to compute the unbiased MMD^2_u statistic using U-statistics.

Parameters:

K – np.array, 2-D matrix
m – int, size of first vector
n – int, size of second vector

Returns:

float, MMD^2_u unbiased statistic using U-statistics

Return type:

val
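
As a rough illustration of what the unbiased statistic involves, the sketch below assumes that K is the (m+n) x (m+n) kernel matrix of the pooled samples, with the first m rows and columns belonging to the first sample. The name `mmd2u_sketch` is hypothetical; the code is not taken from this module and may differ from it in detail.

```python
import numpy as np

def mmd2u_sketch(K, m, n):
    """Unbiased MMD^2 estimate from a pooled kernel matrix (illustrative only)."""
    Kx = K[:m, :m]   # kernel values within the first sample
    Ky = K[m:, m:]   # kernel values within the second sample
    Kxy = K[:m, m:]  # kernel values across the two samples
    # U-statistics exclude the diagonal (i == j) terms of the within-sample blocks.
    term_x = (Kx.sum() - np.trace(Kx)) / (m * (m - 1.0))
    term_y = (Ky.sum() - np.trace(Ky)) / (n * (n - 1.0))
    term_xy = 2.0 * Kxy.sum() / (m * n)
    return term_x + term_y - term_xy

# Hand-made 4x4 kernel matrix: two points per sample (assumed layout).
K = np.array([[1.0, 0.5, 0.2, 0.1],
              [0.5, 1.0, 0.3, 0.4],
              [0.2, 0.3, 1.0, 0.6],
              [0.1, 0.4, 0.6, 1.0]])
value = mmd2u_sketch(K, 2, 2)
```

Because the diagonal terms are dropped, the estimate can be slightly negative when the two samples come from the same distribution.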

src.dackar.anomalies.kernel_two_sample_test.MMD2b(K, m, n)

Method designed to compute the biased MMD^2 statistic using V-statistics.

Parameters:

K – np.array, 2-D matrix
m – int, size of first vector
n – int, size of second vector

Returns:

float, biased MMD^2 statistic using V-statistics

Return type:

out
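
For comparison, a V-statistic version keeps the diagonal terms and divides by m^2 and n^2. The sketch below makes the same assumption about the layout of K (pooled kernel matrix, first sample in the first m rows and columns) and returns the squared quantity; `mmd2b_sketch` is a hypothetical name, not this module's implementation.

```python
import numpy as np

def mmd2b_sketch(K, m, n):
    """Biased MMD^2 estimate from a pooled kernel matrix (illustrative only)."""
    Kx = K[:m, :m]   # within the first sample, diagonal included
    Ky = K[m:, m:]   # within the second sample, diagonal included
    Kxy = K[:m, m:]  # across the two samples
    # V-statistics average over all index pairs, including i == j.
    return Kx.sum() / (m * m) + Ky.sum() / (n * n) - 2.0 * Kxy.sum() / (m * n)

# Hand-made 4x4 kernel matrix: two points per sample (assumed layout).
K = np.array([[1.0, 0.5, 0.2, 0.1],
              [0.5, 1.0, 0.3, 0.4],
              [0.2, 0.3, 1.0, 0.6],
              [0.1, 0.4, 0.6, 1.0]])
value = mmd2b_sketch(K, 2, 2)
```

For a positive semi-definite kernel this estimate is never negative, which is the price paid for its bias.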

src.dackar.anomalies.kernel_two_sample_test.MMD2u_UCB(K, m, alpha=0.05)

Method designed to calculate the uniform convergence bound for MMD2u

Parameters:

K – np.array, 2-D matrix
m – int, sample size
alpha – float, acceptance value for hypothesis testing

Returns:

float, uniform convergence bound for MMD2u

Return type:

ucb

src.dackar.anomalies.kernel_two_sample_test.MMD2b_UCB(K, m, alpha=0.05)

Method designed to calculate the uniform convergence bound for MMD2b

Parameters:

K – np.array, 2-D matrix
m – int, sample size
alpha – float, acceptance value for hypothesis testing

Returns:

float, uniform convergence bound for MMD2b

Return type:

ucb

src.dackar.anomalies.kernel_two_sample_test.compute_null_distribution(K, m, n, iterations=1000, verbose=False, random_state=None, marker_interval=500)

Method designed to calculate the bootstrap null-distribution of MMD2u.

Parameters:

K – np.array, 2-D matrix
m – int, size of first vector
n – int, size of second vector
iterations – int, number of bootstrap iterations
verbose – bool, flag to provide calculation details
random_state – np class, numpy random number generator class
marker_interval – int, interval where calculation details are displayed

Returns:

np.array, null-distribution of MMD2u

Return type:

mmd2u_null
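
One common way to build such a null distribution is to shuffle the pooled sample indices, reorder K accordingly, and recompute the statistic each time. The sketch below follows that idea; it assumes index permutation rather than resampling with replacement, and the names `mmd2u_sketch`, `null_distribution_sketch`, and `seed` are hypothetical. The `verbose` and `marker_interval` options are left out.

```python
import numpy as np

def mmd2u_sketch(K, m, n):
    """Unbiased MMD^2 estimate from a pooled kernel matrix (illustrative only)."""
    Kx, Ky, Kxy = K[:m, :m], K[m:, m:], K[:m, m:]
    return ((Kx.sum() - np.trace(Kx)) / (m * (m - 1.0))
            + (Ky.sum() - np.trace(Ky)) / (n * (n - 1.0))
            - 2.0 * Kxy.sum() / (m * n))

def null_distribution_sketch(K, m, n, iterations=1000, seed=None):
    """Recompute the statistic under random relabelings of the pooled sample."""
    rng = np.random.default_rng(seed)
    null = np.empty(iterations)
    for i in range(iterations):
        idx = rng.permutation(m + n)  # random reassignment of points to samples
        # Reorder rows and columns together so K stays a valid kernel matrix.
        null[i] = mmd2u_sketch(K[np.ix_(idx, idx)], m, n)
    return null
```

The kernel matrix is computed once and only re-indexed inside the loop, so each iteration costs a few block sums rather than a fresh kernel evaluation.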

src.dackar.anomalies.kernel_two_sample_test.compute_null_distribution_given_permutations(K, m, n, permutation, iterations=None)

Method designed to calculate the bootstrap null-distribution of MMD2u given predefined permutations.

Parameters:

K – np.array, 2-D matrix
m – int, size of first vector
n – int, size of second vector
permutation – np.array, array of permutations
iterations – int, number of bootstrap iterations

Returns:

np.array, null-distribution of MMD2u given predefined permutations

Return type:

mmd2u_null

src.dackar.anomalies.kernel_two_sample_test.kernel_two_sample_test(X, Y, kernel_function='rbf', iterations=2000, verbose=False, random_state=None, alpha=0.05, thin=None, **kwargs)

Method designed to calculate MMD^2_u, its null distribution and the p-value of the kernel two-sample test.

Parameters:

X – np.array, first vector
Y – np.array, second vector
kernel_function – string, type of kernel function. Valid values are: additive_chi2, chi2, linear, poly, polynomial, rbf, laplacian, sigmoid, cosine
iterations – int, number of iterations
verbose – bool, flag to provide calculation details
random_state – np class, numpy random number generator class
alpha – float, acceptance value for hypothesis testing
thin – int, sample size for thinning calculation
**kwargs – dict, dictionary of parameters that are passed to pairwise_kernels() as kernel parameters. E.g., calling kernel_two_sample_test(…, kernel_function='rbf', gamma=0.1) results in the kernel being computed with metric='rbf' and gamma=0.1.

Returns:

mmd2u – float, MMD^2_u unbiased statistic using U-statistics
mmd2u_null – np.array, null-distribution of MMD2u
p_value – float, calculated p-value for hypothesis testing
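
The overall flow (kernel matrix, observed statistic, null distribution, p-value) can be sketched end to end as follows. This is a self-contained approximation, not this module's code: it hand-rolls an RBF kernel instead of calling pairwise_kernels(), ignores `thin` and `alpha`, and assumes the p-value is the fraction of null values exceeding the observed statistic, floored at 1/iterations. All function names and the `seed` argument are hypothetical.

```python
import numpy as np

def rbf_kernel_matrix(Z, gamma):
    """Pairwise RBF kernel exp(-gamma * ||z_i - z_j||^2) for the rows of Z."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))  # clip tiny negative round-off

def mmd2u_sketch(K, m, n):
    """Unbiased MMD^2 estimate from a pooled kernel matrix (illustrative only)."""
    Kx, Ky, Kxy = K[:m, :m], K[m:, m:], K[:m, m:]
    return ((Kx.sum() - np.trace(Kx)) / (m * (m - 1.0))
            + (Ky.sum() - np.trace(Ky)) / (n * (n - 1.0))
            - 2.0 * Kxy.sum() / (m * n))

def two_sample_test_sketch(X, Y, gamma=1.0, iterations=200, seed=0):
    """Return (statistic, null distribution, p-value) for samples X and Y."""
    m, n = len(X), len(Y)
    K = rbf_kernel_matrix(np.vstack([X, Y]), gamma)  # pooled kernel matrix
    stat = mmd2u_sketch(K, m, n)
    rng = np.random.default_rng(seed)
    null = np.empty(iterations)
    for i in range(iterations):
        idx = rng.permutation(m + n)
        null[i] = mmd2u_sketch(K[np.ix_(idx, idx)], m, n)
    # Assumed convention: never report a p-value below the resolution 1/iterations.
    p_value = max(1.0 / iterations, float(np.mean(null > stat)))
    return stat, null, p_value

# Two clearly different 2-D Gaussian samples.
data_rng = np.random.default_rng(0)
X = data_rng.normal(0.0, 1.0, size=(50, 2))
Y = data_rng.normal(3.0, 1.0, size=(50, 2))
stat, null, p_value = two_sample_test_sketch(X, Y)
```

With samples this far apart the observed statistic sits well above every permuted value, so the p-value lands at its floor; comparing it against `alpha` then gives the test decision.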

src.dackar.anomalies.kernel_two_sample_test.chebyshevTesting(X, Y, kernel_function='rbf', iterations=2000, verbose=False, random_state=None, alpha=0.01, **kwargs)

Method designed to perform Chebyshev testing using Chebyshev’s inequality

Parameters:

X – np.array, first vector
Y – np.array, second vector
kernel_function – string, type of kernel function
iterations – int, number of bootstrap iterations
verbose – bool, flag to provide calculation details
random_state – np class, numpy random number generator class
alpha – float, acceptance value for hypothesis testing
**kwargs – dict, dictionary of parameters that are passed to pairwise_kernels() as kernel parameters.

Returns:

bool, outcome of Chebyshev testing

Return type:

accept

src.dackar.anomalies.kernel_two_sample_test.chebyshevTesting_precomputed_mmd(mmd2u, mmd2u_null, alpha=0.01)

Method designed to perform MMD Chebyshev testing

Parameters:

mmd2u – float, MMD^2_u unbiased statistic using U-statistics
mmd2u_null – np.array, null-distribution of MMD2u
alpha – float, acceptance value for hypothesis testing

Returns:

bool, outcome of Chebyshev testing

Return type:

accept
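
Chebyshev's inequality states that a random variable falls at least k standard deviations from its mean with probability at most 1/k^2, so choosing k = 1/sqrt(alpha) gives a band that the null statistic leaves with probability at most alpha. The sketch below shows one plausible reading of such a test. It assumes a two-sided band and that a True result means the observed statistic stays inside the band; the module's actual rule and the meaning of `accept` may differ, and `chebyshev_test_sketch` is a hypothetical name.

```python
import numpy as np

def chebyshev_test_sketch(mmd2u, mmd2u_null, alpha=0.01):
    """True if mmd2u lies within the Chebyshev band of the null distribution."""
    mu = np.mean(mmd2u_null)     # centre of the null distribution
    sigma = np.std(mmd2u_null)   # spread of the null distribution
    k = 1.0 / np.sqrt(alpha)     # P(|Z - mu| >= k * sigma) <= 1 / k**2 = alpha
    return bool(abs(mmd2u - mu) < k * sigma)

null = np.array([-1.0, 0.0, 1.0])
inside = chebyshev_test_sketch(0.5, null)     # well within the band
outside = chebyshev_test_sketch(100.0, null)  # far outside the band
```

The bound makes no distributional assumption beyond a finite variance, so it is conservative compared with reading a quantile off the null distribution directly.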