Package edu.cmu.tetrad.search.test
Class Kci
java.lang.Object
edu.cmu.tetrad.search.test.Kci
- All Implemented Interfaces:
RawMarginalIndependenceTest, IndependenceTest
The Kci class implements the Kernel-based Conditional Independence (KCI) test of statistical independence between
variables. It supports several kernel types (e.g., Gaussian, Polynomial, Linear) and computes p-values either by a
Gamma approximation or by a permutation test. The test operates on kernel matrices and uses a bandwidth selection
heuristic for the Gaussian kernel.
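As an illustration of what a kernel matrix is, the following minimal sketch builds a Gaussian kernel (Gram) matrix for a single variable using plain arrays. It is not taken from the Kci source: the class name `GaussianGramSketch`, its method names, and the exact parameterization exp(-d^2 / (2 sigma^2)) are assumptions made for this example.

```java
/** Hypothetical helper, not part of Tetrad: Gaussian kernel matrix for one variable. */
final class GaussianGramSketch {

    /** Gaussian (RBF) kernel value for two scalars with bandwidth sigma > 0. */
    static double gaussian(double a, double b, double sigma) {
        double d = a - b;
        return Math.exp(-(d * d) / (2.0 * sigma * sigma));
    }

    /** Builds the n x n matrix K with K[i][j] = gaussian(x[i], x[j], sigma). */
    static double[][] gram(double[] x, double sigma) {
        if (!(sigma > 0.0)) {
            throw new IllegalArgumentException("sigma must be positive");
        }
        int n = x.length;
        double[][] k = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                double v = gaussian(x[i], x[j], sigma);
                k[i][j] = v;
                k[j][i] = v; // the kernel is symmetric, so fill both halves
            }
        }
        return k;
    }
}
```

Every diagonal entry equals 1 because each observation is at distance zero from itself; off-diagonal entries shrink toward 0 as observations move apart relative to sigma.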
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static enum | Kci.KernelType | Enum representing the type of kernel function used in kernel-based computations.
-
Field Summary
Fields
-
Constructor Summary
Constructors
Constructor | Description
Kci(DataSet dataSet) | Constructs a Kci instance with the given DataSet.
Kci(org.ejml.simple.SimpleMatrix dataVxN, Map<Node, Integer> varToRow, org.ejml.simple.SimpleMatrix hHint, List<Integer> rows) | Constructs a Kci instance using specified data, variable-to-row mapping, an optional hint matrix, and a list of row indices.
-
Method Summary
Modifier and Type | Method | Description
IndependenceResult | checkIndependence(Node x, Node y, Set<Node> z) | Tests the conditional independence of two given variables (x and y) with respect to a set of conditioning variables (z) using the KCI (Kernel-based Conditional Independence) method.
double | computePValue(double[] x, double[] y) | Computes the p-value for testing the independence of two variables represented by the input arrays.
double | computePValueFromCenteredKernels(org.ejml.simple.SimpleMatrix centeredKx, org.ejml.simple.SimpleMatrix centeredKy) | Computes the p-value from two centered kernel matrices using statistical methods.
double | getAlpha() | Retrieves the value of the alpha threshold, which is generally used for statistical tests to determine the significance or rejection criteria.
DataModel | getData() | Retrieves the data model associated with the current instance.
double | getEpsilon() | Retrieves the epsilon value.
Kci.KernelType | getKernelType() | Retrieves the kernel type.
int | getNumPermutations() | Retrieves the number of permutations to be used in permutation tests.
double | getPolyCoef0() | Retrieves the coefficient of the polynomial for the term of degree 0.
int | getPolyDegree() | Retrieves the degree of the polynomial.
double | getPolyGamma() | Retrieves the gamma parameter for the polynomial kernel.
double | getScalingFactor() | Retrieves the scaling factor for the Gaussian bandwidth heuristic.
List<Node> | getVariables() | Retrieves the list of variables associated with the current instance.
boolean | isApproximate() | Retrieves whether the method should use an approximate approach or a permutation test.
double | isIndependenceConditional(Node x, Node y, List<Node> z, double alpha) | Tests for conditional independence between two variables given a set of conditioning variables.
boolean | isVerbose() | Indicates whether verbose mode is enabled.
void | setAlpha(double alpha) | Sets the value of the alpha threshold, which is typically used for statistical testing to determine the significance level or rejection criteria.
void | setApproximate(boolean approximate) | Sets whether the method should use an approximate approach or a permutation test.
void | setEpsilon(double epsilon) | Sets the epsilon value.
void | setKernelType(Kci.KernelType kernelType) | Sets the kernel type.
void | setNumPermutations(int numPermutations) | Sets the number of permutations to be used in permutation tests.
void | setPolyCoef0(double polyCoef0) | Sets the value of the polynomial coefficient at index 0.
void | setPolyDegree(int polyDegree) | Sets the degree of the polynomial.
void | setPolyGamma(double polyGamma) | Sets the polyGamma value.
void | setScalingFactor(double scalingFactor) | Sets the scaling factor for the Gaussian bandwidth heuristic.
void | setVerbose(boolean verbose) | Sets the verbose mode for the current instance.
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface edu.cmu.tetrad.search.test.IndependenceTest
checkIndependence, determines, getCov, getDataSets, getSampleSize, getVariable, getVariableNames, indTestSubset, toString
Methods inherited from interface edu.cmu.tetrad.search.RawMarginalIndependenceTest
computePValue
-
Field Details
-
rng
RNG for permutations; can be null (seeded later).
-
-
Constructor Details
-
Kci
Constructs a Kci instance with the given DataSet.
- Parameters:
dataSet - the dataset containing the data to be analyzed. It is used to initialize the data matrix, variable list, and other attributes.
-
Kci
public Kci(org.ejml.simple.SimpleMatrix dataVxN, Map<Node, Integer> varToRow, org.ejml.simple.SimpleMatrix hHint, List<Integer> rows)
Constructs a Kci instance using specified data, variable-to-row mapping, an optional hint matrix, and a list of row indices. This constructor initializes the internal fields required for kernel-based independence testing.
- Parameters:
dataVxN - a SimpleMatrix representing the data matrix where rows correspond to variables and columns correspond to observations.
varToRow - a map from Node instances to integer indices, specifying the row mapping for variables.
hHint - a SimpleMatrix used as a hint for the kernel computation, often representing precomputed or auxiliary data; can be null if not applicable.
rows - a list of integers representing the indices of rows to be used in the computation.
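The variables-by-observations layout expected for dataVxN is the transpose of the usual one-row-per-observation table. The sketch below shows that layout with plain arrays and string names instead of SimpleMatrix and Node, so that it runs without Tetrad or EJML; the class `VariablesByObservationsSketch` and its methods are hypothetical and exist only for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Tetrad: illustrates a V x N data layout. */
final class VariablesByObservationsSketch {

    /** Transposes an N x V table (one row per observation) into V x N. */
    static double[][] toVxN(double[][] dataNxV) {
        int n = dataNxV.length;
        int v = (n == 0) ? 0 : dataNxV[0].length;
        double[][] out = new double[v][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < v; j++) {
                out[j][i] = dataNxV[i][j]; // variable j, observation i
            }
        }
        return out;
    }

    /** Maps each variable name to the row it occupies in the V x N matrix. */
    static Map<String, Integer> rowIndex(List<String> names) {
        Map<String, Integer> map = new LinkedHashMap<>();
        for (int r = 0; r < names.size(); r++) {
            map.put(names.get(r), r);
        }
        return map;
    }
}
```

With this layout, all observations of one variable sit in a single row, and the map plays the role that varToRow plays in the constructor.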
-
-
Method Details
-
getAlpha
public double getAlpha()
Retrieves the alpha threshold, the significance level against which the test's p-value is compared.
- Specified by:
getAlpha in interface IndependenceTest
- Returns:
- the value of alpha as a double.
-
setAlpha
public void setAlpha(double alpha)
Sets the alpha threshold, the significance level against which the test's p-value is compared.
- Specified by:
setAlpha in interface IndependenceTest
- Parameters:
alpha - the value of alpha to set, represented as a double.
-
checkIndependence
public IndependenceResult checkIndependence(Node x, Node y, Set<Node> z) throws InterruptedException
Tests whether two variables (x and y) are conditionally independent given a set of conditioning variables (z) using the KCI (Kernel-based Conditional Independence) method. The method calculates a p-value and compares it against the alpha threshold.
- Specified by:
checkIndependence in interface IndependenceTest
- Parameters:
x - the first variable to be tested for independence, represented as a Node.
y - the second variable to be tested for independence, represented as a Node.
z - the set of conditioning variables, represented as a Set of Node objects.
- Returns:
- an IndependenceResult object containing the results of the independence test, including the independence fact, the p-value, and additional statistical details.
- Throws:
InterruptedException - if the thread executing the method is interrupted during execution.
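The comparison of a p-value against alpha can be sketched as a one-line decision rule. The class `AlphaDecisionSketch` is hypothetical, and the handling of the boundary case (p-value exactly equal to alpha) is an assumption of this example; the documentation above does not state how Kci treats it.

```java
/** Hypothetical helper, not part of Tetrad: p-value versus alpha decision rule. */
final class AlphaDecisionSketch {

    /**
     * Returns true when the null hypothesis of independence is not rejected.
     * Assumption: independence is judged to hold when pValue > alpha.
     */
    static boolean judgedIndependent(double pValue, double alpha) {
        if (Double.isNaN(pValue) || !(alpha > 0.0 && alpha < 1.0)) {
            throw new IllegalArgumentException("need a valid p-value and 0 < alpha < 1");
        }
        return pValue > alpha;
    }
}
```

For example, with alpha = 0.05 a p-value of 0.2 leaves the independence hypothesis standing, while a p-value of 0.01 rejects it.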
-
getVariables
Retrieves the list of variables associated with the current instance. This method returns a new list containing the variables, so modifications to the returned list do not affect the original list.
- Specified by:
getVariables in interface IndependenceTest
- Returns:
- a List of Node objects representing the variables.
-
getData
Retrieves the data model associated with the current instance.
- Specified by:
getData in interface IndependenceTest
- Returns:
- the DataModel object representing the dataset being analyzed.
-
isVerbose
public boolean isVerbose()
Indicates whether verbose mode is enabled.
- Specified by:
isVerbose in interface IndependenceTest
- Returns:
- true if verbose mode is enabled, false otherwise
-
setVerbose
public void setVerbose(boolean verbose)
Sets the verbose mode for the current instance.
- Specified by:
setVerbose in interface IndependenceTest
- Parameters:
verbose - true to enable verbose mode, false to disable it.
-
isIndependenceConditional
Tests for conditional independence between two variables given a set of conditioning variables. This method computes a test statistic and its corresponding p-value using either an approximate method or a permutation-based method, depending on the configuration.
- Parameters:
x - The first variable to test for independence.
y - The second variable to test for independence.
z - The list of conditioning variables.
alpha - The significance level used for the independence test.
- Returns:
- The p-value of the conditional independence test. A small p-value (less than alpha) indicates that x and y are not conditionally independent given z.
- Throws:
NullPointerException - If x or y is null.
-
computePValue
public double computePValue(double[] x, double[] y)
Computes the p-value for testing the independence of two variables represented by the input arrays. The method utilizes a kernel-based conditional independence test (KCI) provided by BFIT.
- Specified by:
computePValue in interface RawMarginalIndependenceTest
- Parameters:
x - the first array of observed values representing one variable. It must not be null and should contain at least three elements.
y - the second array of observed values representing another variable. It must not be null, should contain at least three elements, and have the same length as the first array.
- Returns:
- the computed p-value as a double. A result closer to 0 suggests stronger evidence against the null hypothesis of independence, while a value close to 1 supports independence. If the input arrays are invalid or if an error occurs, the method returns 1.0.
-
computePValueFromCenteredKernels
public double computePValueFromCenteredKernels(org.ejml.simple.SimpleMatrix centeredKx, org.ejml.simple.SimpleMatrix centeredKy)
Computes the p-value from two centered kernel matrices. Depending on whether the approximate method is selected, it calculates the p-value using a Gamma distribution or a permutation test.
- Parameters:
centeredKx - A centered kernel matrix (n x n) representing one dataset.
centeredKy - A centered kernel matrix (n x n) representing another dataset.
- Returns:
- The computed p-value indicating the statistical relationship between the two datasets.
- Throws:
IllegalArgumentException - If the provided matrices are not both square with the same dimensions (n x n).
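To make "centered kernel matrix" concrete, here is a minimal sketch of double centering (H K H with H = I - (1/n) 1 1^T) and of a trace-of-product statistic on two such matrices, written with plain arrays rather than SimpleMatrix. The class `CenteredKernelSketch` and its methods are hypothetical, and using trace(Kx Ky) as the statistic follows the usual description of kernel independence tests; it is an assumption about, not a copy of, what Kci computes internally.

```java
/** Hypothetical helper, not part of Tetrad: kernel centering and trace statistic. */
final class CenteredKernelSketch {

    /** Throws if m is not an n x n matrix. */
    static void requireSquare(double[][] m, int n) {
        if (m.length != n) {
            throw new IllegalArgumentException("matrices must be n x n with the same n");
        }
        for (double[] row : m) {
            if (row.length != n) {
                throw new IllegalArgumentException("matrices must be square");
            }
        }
    }

    /** Double-centers K: result[i][j] = K[i][j] - rowMean[i] - colMean[j] + grandMean. */
    static double[][] center(double[][] k) {
        int n = k.length;
        requireSquare(k, n);
        double[] rowMean = new double[n];
        double[] colMean = new double[n];
        double grand = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                rowMean[i] += k[i][j] / n;
                colMean[j] += k[i][j] / n;
                grand += k[i][j] / ((double) n * n);
            }
        }
        double[][] c = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                c[i][j] = k[i][j] - rowMean[i] - colMean[j] + grand;
            }
        }
        return c;
    }

    /** Computes trace(A B) without forming the matrix product. */
    static double traceOfProduct(double[][] a, double[][] b) {
        int n = a.length;
        requireSquare(a, n);
        requireSquare(b, n);
        double t = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                t += a[i][j] * b[j][i];
            }
        }
        return t;
    }
}
```

After centering, every row and column sums to zero. The size check mirrors the documented IllegalArgumentException for matrices that are not square or differ in size.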
-
getPolyDegree
public int getPolyDegree()
Retrieves the degree of the polynomial kernel.
- Returns:
- the degree of the polynomial as an integer
-
setPolyDegree
public void setPolyDegree(int polyDegree)
Sets the degree of the polynomial kernel.
- Parameters:
polyDegree - the degree of the polynomial to be set
-
getPolyCoef0
public double getPolyCoef0()
Retrieves the coefficient of the polynomial for the term of degree 0.
- Returns:
- the value of the polynomial coefficient for the term of degree 0
-
setPolyCoef0
public void setPolyCoef0(double polyCoef0)
Sets the coefficient of the polynomial for the term of degree 0.
- Parameters:
polyCoef0 - the value to set for the polynomial coefficient of the term of degree 0
-
getPolyGamma
public double getPolyGamma()
Retrieves the gamma parameter for the polynomial kernel.
- Returns:
- the gamma parameter for the polynomial kernel
-
setPolyGamma
public void setPolyGamma(double polyGamma)
Sets the gamma parameter for the polynomial kernel.
- Parameters:
polyGamma - the gamma parameter to set for the polynomial kernel
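The three polynomial settings (degree, coef0, gamma) suggest the widely used polynomial kernel form (gamma * <a, b> + coef0)^degree. The sketch below evaluates that form; the formula is a common convention assumed for this example and is not confirmed by this page, and the class `PolynomialKernelSketch` is hypothetical.

```java
/** Hypothetical helper, not part of Tetrad: a common polynomial kernel form. */
final class PolynomialKernelSketch {

    /** Evaluates (gamma * dot(a, b) + coef0)^degree for vectors of equal length. */
    static double polynomial(double[] a, double[] b, double gamma, double coef0, int degree) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        if (degree < 1) {
            throw new IllegalArgumentException("degree must be at least 1");
        }
        double dot = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
        }
        return Math.pow(gamma * dot + coef0, degree);
    }
}
```

Under this form, degree = 1, gamma = 1, and coef0 = 0 reduce the polynomial kernel to the plain inner product, which is the linear kernel.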
-
getKernelType
Retrieves the kernel type.
- Returns:
- the kernel type
-
setKernelType
Sets the kernel type.
- Parameters:
kernelType - the kernel type to set
-
getEpsilon
public double getEpsilon()
Retrieves the epsilon value.
- Returns:
- the epsilon value
-
setEpsilon
public void setEpsilon(double epsilon)
Sets the epsilon value.
- Parameters:
epsilon - the epsilon value to set
-
getScalingFactor
public double getScalingFactor()
Retrieves the scaling factor for the Gaussian bandwidth heuristic.
- Returns:
- the scaling factor
-
setScalingFactor
public void setScalingFactor(double scalingFactor)
Sets the scaling factor for the Gaussian bandwidth heuristic. The scaling factor is used to modify the bandwidth by scaling it multiplicatively (sigma *= scalingFactor).
- Parameters:
scalingFactor - the scaling factor to set; a multiplier for the Gaussian bandwidth heuristic.
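The multiplicative scaling can be sketched with a median-distance bandwidth heuristic. Which heuristic Kci actually uses is not stated on this page, so the median of pairwise absolute differences is an assumption of this example, as are the class `BandwidthSketch` and its method name.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Tetrad: median heuristic with a scaling factor. */
final class BandwidthSketch {

    /** Returns median(|x[i] - x[j]| for i < j) multiplied by scalingFactor. */
    static double medianHeuristic(double[] x, double scalingFactor) {
        int n = x.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two observations");
        }
        double[] d = new double[n * (n - 1) / 2];
        int p = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                d[p++] = Math.abs(x[i] - x[j]);
            }
        }
        Arrays.sort(d);
        int m = d.length;
        double sigma = (m % 2 == 1) ? d[m / 2] : 0.5 * (d[m / 2 - 1] + d[m / 2]);
        sigma *= scalingFactor; // the multiplicative adjustment described above
        return sigma;
    }
}
```

A scaling factor below 1 narrows the Gaussian kernel and a factor above 1 widens it.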
-
isApproximate
public boolean isApproximate()
Indicates whether p-values are computed with the approximate method or with a permutation test.
- Returns:
- true if approximate method is used, false if permutation test is used
-
setApproximate
public void setApproximate(boolean approximate)
Sets whether p-values are computed with the approximate method or with a permutation test.
- Parameters:
approximate - true to use the approximate method, false to use the permutation test
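A Gamma approximation is typically obtained by moment matching: a Gamma distribution with shape k and scale theta has mean k * theta and variance k * theta^2, so k = mean^2 / variance and theta = variance / mean. The sketch below performs only that step. How Kci estimates the null mean and variance is not described on this page, and the class `GammaMomentMatchSketch` is hypothetical.

```java
/** Hypothetical helper, not part of Tetrad: Gamma parameters from two moments. */
final class GammaMomentMatchSketch {

    /** Returns {shape, scale} of the Gamma distribution with the given mean and variance. */
    static double[] shapeAndScale(double mean, double variance) {
        if (!(mean > 0.0) || !(variance > 0.0)) {
            throw new IllegalArgumentException("mean and variance must be positive");
        }
        double shape = mean * mean / variance;
        double scale = variance / mean;
        return new double[]{shape, scale};
    }
}
```

The p-value would then be the upper tail probability of the observed statistic under that Gamma distribution, which requires a Gamma CDF from a numerical library and is left out here.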
-
getNumPermutations
public int getNumPermutations()
Retrieves the number of permutations to be used in permutation tests.
- Returns:
- the number of permutations to be used in permutation tests
-
setNumPermutations
public void setNumPermutations(int numPermutations)
Sets the number of permutations to be used in permutation tests.
- Parameters:
numPermutations - the number of permutations to set, that is, how many random shufflings of the data are used to approximate the null distribution of the test statistic.
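The role of numPermutations can be sketched with a generic permutation p-value on two centered kernel matrices. This is a minimal illustration under stated assumptions, not Kci's implementation: the class `PermutationPValueSketch`, the statistic (sum of elementwise products), and the (1 + count) / (1 + numPermutations) estimator are choices made for this example.

```java
import java.util.Random;

/** Hypothetical helper, not part of Tetrad: permutation p-value on kernel matrices. */
final class PermutationPValueSketch {

    /** Sum over i, j of a[i][j] * b[perm[i]][perm[j]] for n x n matrices. */
    static double statistic(double[][] a, double[][] b, int[] perm) {
        int n = a.length;
        double s = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                s += a[i][j] * b[perm[i]][perm[j]];
            }
        }
        return s;
    }

    /** Fisher-Yates shuffle of the index array, in place. */
    static void shuffle(int[] idx, Random rng) {
        for (int i = idx.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            int t = idx[i];
            idx[i] = idx[j];
            idx[j] = t;
        }
    }

    /** Fraction of permuted statistics at least as large as the observed one. */
    static double pValue(double[][] kx, double[][] ky, int numPermutations, Random rng) {
        if (numPermutations < 0) {
            throw new IllegalArgumentException("numPermutations must be non-negative");
        }
        int n = kx.length;
        int[] perm = new int[n];
        for (int i = 0; i < n; i++) {
            perm[i] = i; // identity permutation gives the observed statistic
        }
        double observed = statistic(kx, ky, perm);
        int atLeastAsLarge = 0;
        for (int b = 0; b < numPermutations; b++) {
            shuffle(perm, rng); // relabel the observations of the second variable
            if (statistic(kx, ky, perm) >= observed) {
                atLeastAsLarge++;
            }
        }
        // Adding 1 to numerator and denominator keeps the estimate strictly positive.
        return (1.0 + atLeastAsLarge) / (1.0 + numPermutations);
    }
}
```

More permutations give a finer-grained p-value (the smallest attainable value is 1 / (1 + numPermutations)) at proportionally higher cost. Passing a seeded Random makes the result reproducible.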
-