src.dackar.anomalies.kernel_two_sample_test

Created on July, 2025

@author: wangc, mandd

Attributes

logger

Functions

MMD2u(K, m, n)

Method designed to compute the unbiased MMD^2_u statistic using U-statistics.

MMD2b(K, m, n)

Method designed to compute the biased MMD^2 statistic using V-statistics.

MMD2u_UCB(K, m[, alpha])

Method designed to calculate the uniform convergence bound for MMD2u

MMD2b_UCB(K, m[, alpha])

Method designed to calculate the uniform convergence bound for MMD2b

compute_null_distribution(K, m, n[, iterations, ...])

Method designed to calculate the bootstrap null-distribution of MMD2u.

compute_null_distribution_given_permutations(K, m, n, ...)

Method designed to calculate the bootstrap null-distribution of MMD2u given predefined permutations.

kernel_two_sample_test(X, Y[, kernel_function, ...])

Method designed to calculate MMD^2_u, its null distribution and the p-value of the kernel two-sample test.

chebyshevTesting(X, Y[, kernel_function, iterations, ...])

Method designed to perform Chebyshev testing using Chebyshev's inequality

chebyshevTesting_precomputed_mmd(mmd2u, mmd2u_null[, ...])

Method designed to perform Chebyshev testing on a precomputed MMD^2_u statistic and its null distribution.

Module Contents

src.dackar.anomalies.kernel_two_sample_test.logger

src.dackar.anomalies.kernel_two_sample_test.MMD2u(K, m, n)

Method designed to compute the unbiased MMD^2_u statistic using U-statistics.

Parameters:
  • K – np.array, 2-D matrix

  • m – int, size of first vector

  • n – int, size of second vector

Returns:

val – float, MMD^2_u unbiased statistic using U-statistics

Return type:

float
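As an illustration of the statistic described above, here is a minimal sketch; it is not necessarily the module's implementation. It assumes K is the kernel matrix of the pooled sample, with the first m rows/columns belonging to the first sample and the remaining n to the second. The name mmd2u_sketch and the RBF kernel with gamma = 0.5 are hypothetical choices made for the example.

```python
import numpy as np

def mmd2u_sketch(K, m, n):
    """Unbiased MMD^2 estimate from a pooled (m+n) x (m+n) kernel matrix (hypothetical helper)."""
    Kx = K[:m, :m]    # kernel values within the first sample
    Ky = K[m:, m:]    # kernel values within the second sample
    Kxy = K[:m, m:]   # kernel values across the two samples
    # U-statistic: the diagonal (i == j) terms of the within-sample blocks are excluded
    term_x = (Kx.sum() - np.trace(Kx)) / (m * (m - 1))
    term_y = (Ky.sum() - np.trace(Ky)) / (n * (n - 1))
    term_xy = 2.0 * Kxy.sum() / (m * n)
    return term_x + term_y - term_xy

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(50, 2))
Y = rng.normal(3.0, 1.0, size=(60, 2))   # clearly shifted second sample
XY = np.vstack([X, Y])
sq_dist = np.sum((XY[:, None, :] - XY[None, :, :]) ** 2, axis=-1)
K = np.exp(-0.5 * sq_dist)               # RBF kernel, gamma = 0.5 (assumed)
stat = mmd2u_sketch(K, 50, 60)
```

For samples drawn from clearly different distributions, as here, the statistic is well above zero; for samples from the same distribution it fluctuates around zero and can be slightly negative.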

src.dackar.anomalies.kernel_two_sample_test.MMD2b(K, m, n)

Method designed to compute the biased MMD^2 statistic using V-statistics.

Parameters:
  • K – np.array, 2-D matrix

  • m – int, size of first vector

  • n – int, size of second vector

Returns:

out – float, MMD^2 biased statistics using V-statistics

Return type:

float
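The biased counterpart can be sketched in the same way. This is an illustrative version under the assumption that K is the pooled kernel matrix with the first sample occupying the first m rows/columns; mmd2b_sketch is a hypothetical name. The example uses a linear kernel, for which the V-statistic reduces to the squared distance between the two sample means.

```python
import numpy as np

def mmd2b_sketch(K, m, n):
    """Biased MMD^2 estimate from a pooled kernel matrix (hypothetical helper)."""
    Kx = K[:m, :m]
    Ky = K[m:, m:]
    Kxy = K[:m, m:]
    # V-statistic: diagonal terms are kept, which is what introduces the bias
    return Kx.sum() / m**2 + Ky.sum() / n**2 - 2.0 * Kxy.sum() / (m * n)

rng = np.random.default_rng(1)
X = rng.normal(size=(8, 3))
Y = rng.normal(loc=1.0, size=(5, 3))
XY = np.vstack([X, Y])
K_lin = XY @ XY.T                        # linear kernel k(x, y) = x . y
biased = mmd2b_sketch(K_lin, 8, 5)
# With a linear kernel the V-statistic equals ||mean(X) - mean(Y)||^2
mean_gap = np.sum((X.mean(axis=0) - Y.mean(axis=0)) ** 2)
```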

src.dackar.anomalies.kernel_two_sample_test.MMD2u_UCB(K, m, alpha=0.05)

Method designed to calculate the uniform convergence bound for MMD2u

Parameters:
  • K – np.array, 2-D matrix

  • m – int, sample size

  • alpha – float, acceptance value for hypothesis testing

Returns:

ucb – float, uniform convergence bound for MMD2u

Return type:

float

src.dackar.anomalies.kernel_two_sample_test.MMD2b_UCB(K, m, alpha=0.05)

Method designed to calculate the uniform convergence bound for MMD2b

Parameters:
  • K – np.array, 2-D matrix

  • m – int, sample size

  • alpha – float, acceptance value for hypothesis testing

Returns:

ucb – float, uniform convergence bound for MMD2b

Return type:

float
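This page does not spell out the formulas behind MMD2u_UCB and MMD2b_UCB. For orientation only, the sketch below evaluates the distribution-free acceptance thresholds given in Gretton et al., "A Kernel Two-Sample Test", which assume equal sample sizes m and a kernel bounded by 0 ≤ k(x, y) ≤ k_max. Using a scalar bound k_max (for example 1 for an RBF kernel) instead of the matrix K is an assumption, as are the function names; the module may derive its bound differently. The second threshold applies to the biased MMD_b itself, not to its square.

```python
import math

def mmd2u_bound_sketch(k_max, m, alpha=0.05):
    """Threshold for MMD^2_u at level alpha, equal sample sizes m (hypothetical helper)."""
    return 4.0 * k_max / math.sqrt(m) * math.sqrt(math.log(1.0 / alpha))

def mmdb_bound_sketch(k_max, m, alpha=0.05):
    """Threshold for the biased MMD_b (not squared) at level alpha (hypothetical helper)."""
    return math.sqrt(2.0 * k_max / m) * (1.0 + math.sqrt(2.0 * math.log(1.0 / alpha)))

bound_u = mmd2u_bound_sketch(1.0, 100)   # about 0.692
bound_b = mmdb_bound_sketch(1.0, 100)    # about 0.488
```

Both thresholds shrink as m grows and widen as alpha decreases, so they tend to be conservative for small samples.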

src.dackar.anomalies.kernel_two_sample_test.compute_null_distribution(K, m, n, iterations=1000, verbose=False, random_state=None, marker_interval=500)

Method designed to calculate the bootstrap null-distribution of MMD2u.

Parameters:
  • K – np.array, 2-D matrix

  • m – int, size of first vector

  • n – int, size of second vector

  • iterations – int, number of bootstrap iterations

  • verbose – bool, flag to provide calculation details

  • random_state – np class, numpy random number generator class

  • marker_interval – int, interval where calculation details are displayed

Returns:

mmd2u_null – np.array, null-distribution of MMD2u

Return type:

np.array
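One common way to build such a null distribution is to shuffle the rows and columns of the pooled kernel matrix together and recompute the statistic each time. The sketch below follows that idea; whether the module resamples in exactly this way is an assumption. The names mmd2u_sketch and null_distribution_sketch are hypothetical, and random_state is handled through np.random.default_rng rather than whatever generator class the module expects.

```python
import numpy as np

def mmd2u_sketch(K, m, n):
    """Unbiased MMD^2 from a pooled kernel matrix (hypothetical helper)."""
    Kx, Ky, Kxy = K[:m, :m], K[m:, m:], K[:m, m:]
    return ((Kx.sum() - np.trace(Kx)) / (m * (m - 1))
            + (Ky.sum() - np.trace(Ky)) / (n * (n - 1))
            - 2.0 * Kxy.sum() / (m * n))

def null_distribution_sketch(K, m, n, iterations=1000, random_state=None):
    """Statistic values obtained after randomly relabelling the pooled sample."""
    rng = np.random.default_rng(random_state)
    null = np.empty(iterations)
    for i in range(iterations):
        idx = rng.permutation(m + n)
        # Apply the same permutation to rows and columns of K
        null[i] = mmd2u_sketch(K[np.ix_(idx, idx)], m, n)
    return null

rng = np.random.default_rng(42)
Z = rng.normal(size=(60, 2))             # pooled sample, 30 + 30 points
sq_dist = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
K = np.exp(-0.5 * sq_dist)               # RBF kernel, gamma = 0.5 (assumed)
null = null_distribution_sketch(K, 30, 30, iterations=200, random_state=0)
```

Because the kernel matrix is computed once and only re-indexed, each iteration costs a few array sums rather than a new kernel evaluation.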

src.dackar.anomalies.kernel_two_sample_test.compute_null_distribution_given_permutations(K, m, n, permutation, iterations=None)

Method designed to calculate the bootstrap null-distribution of MMD2u given predefined permutations.

Parameters:
  • K – np.array, 2-D matrix

  • m – int, size of first vector

  • n – int, size of second vector

  • permutation – np.array, array of permutations

  • iterations – int, number of bootstrap iterations

Returns:

mmd2u_null – np.array, null-distribution of MMD2u given predefined permutations

Return type:

np.array

src.dackar.anomalies.kernel_two_sample_test.kernel_two_sample_test(X, Y, kernel_function='rbf', iterations=2000, verbose=False, random_state=None, alpha=0.05, thin=None, **kwargs)

Method designed to calculate MMD^2_u, its null distribution and the p-value of the kernel two-sample test.

Parameters:
  • X – np.array, first vector

  • Y – np.array, second vector

  • kernel_function – string, type of kernel function. Valid values are: additive_chi2, chi2, linear, poly, polynomial, rbf, laplacian, sigmoid, cosine

  • iterations – int, number of iterations

  • verbose – bool, flag to provide calculation details

  • random_state – np class, numpy random number generator class

  • alpha – float, acceptance value for hypothesis testing

  • thin – int, sample size for thinning calculation

  • **kwargs – dict, dictionary of parameters that are passed to pairwise_kernels() as kernel parameters. E.g. kernel_two_sample_test(…, kernel_function='rbf', gamma=0.1) computes the kernel through pairwise_kernels(…, metric='rbf', gamma=0.1).

Returns:
  • mmd2u – float, MMD^2_u unbiased statistic using U-statistics

  • mmd2u_null – np.array, null-distribution of MMD2u

  • p_value – float, calculated p-value for hypothesis testing

Return type:

tuple (float, np.array, float)
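The three returned quantities fit together as sketched below. This is a NumPy-only illustration, not the module's code: it fixes an RBF kernel with an assumed gamma, ignores the thin option, and floors the p-value at 1/iterations so that it is never reported as exactly zero (an assumed convention). All function names are hypothetical.

```python
import numpy as np

def rbf_kernel_matrix(Z, gamma):
    """Pairwise RBF kernel exp(-gamma * ||z_i - z_j||^2) (hypothetical helper)."""
    sq_dist = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma * sq_dist)

def mmd2u_sketch(K, m, n):
    """Unbiased MMD^2 from a pooled kernel matrix (hypothetical helper)."""
    Kx, Ky, Kxy = K[:m, :m], K[m:, m:], K[:m, m:]
    return ((Kx.sum() - np.trace(Kx)) / (m * (m - 1))
            + (Ky.sum() - np.trace(Ky)) / (n * (n - 1))
            - 2.0 * Kxy.sum() / (m * n))

def two_sample_test_sketch(X, Y, gamma=0.5, iterations=500, random_state=None):
    m, n = len(X), len(Y)
    K = rbf_kernel_matrix(np.vstack([X, Y]), gamma)
    mmd2u = mmd2u_sketch(K, m, n)
    rng = np.random.default_rng(random_state)
    null = np.empty(iterations)
    for i in range(iterations):
        idx = rng.permutation(m + n)
        null[i] = mmd2u_sketch(K[np.ix_(idx, idx)], m, n)
    # Share of resampled statistics exceeding the observed one, floored at 1/iterations
    p_value = max(1.0 / iterations, float(np.mean(null > mmd2u)))
    return mmd2u, null, p_value

rng = np.random.default_rng(7)
X = rng.normal(0.0, 1.0, size=(40, 2))
Y_shifted = rng.normal(3.0, 1.0, size=(40, 2))
mmd2u, null, p_value = two_sample_test_sketch(X, Y_shifted, random_state=0)
```

A p_value below the chosen alpha suggests that the two samples were not drawn from the same distribution.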

src.dackar.anomalies.kernel_two_sample_test.chebyshevTesting(X, Y, kernel_function='rbf', iterations=2000, verbose=False, random_state=None, alpha=0.01, **kwargs)

Method designed to perform Chebyshev testing using Chebyshev’s inequality

Parameters:
  • X – np.array, first vector

  • Y – np.array, second vector

  • kernel_function – string, type of kernel function

  • iterations – int, number of bootstrap iterations

  • verbose – bool, flag to provide calculation details

  • random_state – np class, numpy random number generator class

  • alpha – float, acceptance value for hypothesis testing

  • **kwargs – dict, dictionary of parameters that are passed to pairwise_kernels() as kernel parameters.

Returns:

accept – bool, outcome of Chebyshev testing

Return type:

bool

src.dackar.anomalies.kernel_two_sample_test.chebyshevTesting_precomputed_mmd(mmd2u, mmd2u_null, alpha=0.01)

Method designed to perform Chebyshev testing on a precomputed MMD^2_u statistic and its null distribution.

Parameters:
  • mmd2u – float, MMD^2_u unbiased statistic using U-statistics

  • mmd2u_null – np.array, null-distribution of MMD2u

  • alpha – float, acceptance value for hypothesis testing

Returns:

accept – bool, outcome of Chebyshev testing

Return type:

bool
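The decision rule itself is not described on this page. One plausible reading, sketched below under that assumption, uses Chebyshev's inequality P(|Z - mu| >= k * sigma) <= 1 / k^2: choosing k = 1 / sqrt(alpha) gives a band around the null mean that the statistic leaves with probability at most alpha. In this sketch True means the observed statistic stays inside the band, so the null hypothesis is not rejected; the module's accept flag may follow a different convention. The name chebyshev_accept_sketch is hypothetical.

```python
import numpy as np

def chebyshev_accept_sketch(mmd2u, mmd2u_null, alpha=0.01):
    """True if mmd2u lies within the Chebyshev band of the null distribution (hypothetical helper)."""
    mu = np.mean(mmd2u_null)
    sigma = np.std(mmd2u_null)
    k = 1.0 / np.sqrt(alpha)             # band half-width in units of sigma
    return bool(abs(mmd2u - mu) < k * sigma)

null_values = np.array([-0.02, -0.01, 0.0, 0.01, 0.02])
inside = chebyshev_accept_sketch(0.05, null_values)   # within the band
outside = chebyshev_accept_sketch(0.5, null_values)   # far outside the band
```

Chebyshev's inequality makes no assumption about the shape of the null distribution, which makes the resulting band wide compared with an empirical quantile of the same level.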