Package edu.cmu.tetrad.search.test
Class ConditionalCorrelationIndependence
java.lang.Object
edu.cmu.tetrad.search.test.ConditionalCorrelationIndependence
- All Implemented Interfaces:
RowsSettable
Checks conditional independence of variables in a continuous data set using Daudin's method. See
Ramsey, J. D. (2014). A scalable conditional independence test for nonlinear, non-Gaussian data. arXiv preprint arXiv:1401.5031.
This is corrected using Lemma 2, condition 4 of
Zhang, K., Peters, J., Janzing, D., and Schölkopf, B. (2012). Kernel-based conditional independence test and application in causal discovery. arXiv preprint arXiv:1202.3775.
All of this follows the original paper by Daudin:
Daudin, J. J. (1980). Partial association measures and an application to qualitative regression. Biometrika, 67(3), 581-590.
Updated 2024-11-24 josephramsey
- Author:
- josephramsey
-
Constructor Summary
Constructors
Constructor, Description
ConditionalCorrelationIndependence(DataSet dataSet, int basisType, double basisScale, int numFunctions)
    Initializes a new instance of the ConditionalCorrelationIndependence class using the provided DataSet.
-
Method Summary
Modifier and Type, Method, Description
int getNumFunctions()
    Retrieves the number of functions used in the ConditionalCorrelationIndependence analysis.
double getPValue(double score)
    Calculates the p-value for a given score using the cumulative distribution function (CDF) of a standard normal distribution.
getRows()
    Retrieves the list of row indices currently set for the analysis.
double getScalingFactor()
    Retrieves the kernel scaling factor.
double isIndependent(Node x, Node y, Set<Node> _z)
    Determines whether two given nodes are independent given a set of conditioning nodes, and calculates a score.
double permutationTest(Node x, Node y, Set<Node> z, int numPermutations)
    Performs a permutation test to empirically determine the distribution of p-values under the null hypothesis.
void setNumFunctions(int numFunctions)
    Sets the number of functions used in the ConditionalCorrelationIndependence analysis.
void setRows
    Sets the list of row indices.
void setScalingFactor(double scalingFactor)
    Sets the bandwidth adjustment value for the ConditionalCorrelationIndependence analysis.
-
Constructor Details
-
ConditionalCorrelationIndependence
public ConditionalCorrelationIndependence(DataSet dataSet, int basisType, double basisScale, int numFunctions)
Initializes a new instance of the ConditionalCorrelationIndependence class using the provided DataSet.
- Parameters:
dataSet - The dataset to be used for the analysis. This dataset must not be null and will be standardized.
basisType - The type of basis function to be used in the analysis. This value must be a positive integer.
basisScale - The scaling factor used to adjust the bandwidth for the analysis, or 0.0 if the data should be standardized.
numFunctions - The number of functions to be used in the analysis. This value must be a positive integer.
- Throws:
NullPointerException - if the provided dataset is null.
-
-
Method Details
-
isIndependent
public double isIndependent(Node x, Node y, Set<Node> _z)
Determines whether two given nodes are independent given a set of conditioning nodes, and calculates a score.
- Parameters:
x - The first node.
y - The second node.
_z - The set of conditioning nodes.
- Returns:
- The score representing the level of independence between nodes x and y given the conditioning set _z. Returns Double.NaN if the score cannot be computed or is not a number.
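The idea behind the score can be sketched in a few lines, under heavy simplifying assumptions. The sketch below is not the Tetrad implementation: it conditions on a single variable, uses ordinary linear regression where the cited papers use nonlinear regression, and uses plain powers as basis functions. The class name ConditionalCorrelationSketch and all of its methods are hypothetical. It regresses x and y on z, applies the basis functions to the two residual series, and reports the largest absolute correlation; values near zero are consistent with conditional independence.

```java
/** Hypothetical illustration only; names and simplifications are assumptions, not Tetrad code. */
final class ConditionalCorrelationSketch {

    static double mean(double[] a) {
        double sum = 0.0;
        for (double v : a) sum += v;
        return sum / a.length;
    }

    /** Residuals of v after least-squares regression on one conditioning variable z. */
    static double[] residuals(double[] v, double[] z) {
        double mv = mean(v), mz = mean(z);
        double szz = 0.0, szv = 0.0;
        for (int i = 0; i < v.length; i++) {
            szz += (z[i] - mz) * (z[i] - mz);
            szv += (z[i] - mz) * (v[i] - mv);
        }
        double slope = szv / szz;
        double[] r = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            r[i] = (v[i] - mv) - slope * (z[i] - mz);
        }
        return r;
    }

    /** Pearson correlation of two equally long series. */
    static double correlation(double[] a, double[] b) {
        double ma = mean(a), mb = mean(b);
        double sab = 0.0, saa = 0.0, sbb = 0.0;
        for (int i = 0; i < a.length; i++) {
            sab += (a[i] - ma) * (b[i] - mb);
            saa += (a[i] - ma) * (a[i] - ma);
            sbb += (b[i] - mb) * (b[i] - mb);
        }
        return sab / Math.sqrt(saa * sbb);
    }

    /** Element-wise power, used here as a simple polynomial basis function. */
    static double[] power(double[] a, int degree) {
        double[] out = new double[a.length];
        for (int i = 0; i < a.length; i++) out[i] = Math.pow(a[i], degree);
        return out;
    }

    /** Largest |corr(rx^j, ry^k)| for j, k = 1..numFunctions, where rx, ry are residuals given z. */
    static double maxAbsBasisCorrelation(double[] x, double[] y, double[] z, int numFunctions) {
        double[] rx = residuals(x, z);
        double[] ry = residuals(y, z);
        double max = 0.0;
        for (int j = 1; j <= numFunctions; j++) {
            double[] fx = power(rx, j);
            for (int k = 1; k <= numFunctions; k++) {
                max = Math.max(max, Math.abs(correlation(fx, power(ry, k))));
            }
        }
        return max;
    }
}
```

A real test would turn such correlations into a test statistic and a p-value; this sketch stops at the raw correlations.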
-
getPValue
public double getPValue(double score)
Calculates the p-value for a given score using the cumulative distribution function (CDF) of a standard normal distribution.
- Parameters:
score - The score for which the p-value needs to be calculated. This score is typically a test statistic resulting from some statistical test.
- Returns:
- The p-value corresponding to the given score, indicating the probability of obtaining a value at least as extreme as the observed score under the null hypothesis.
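As a rough illustration of the calculation, the sketch below converts a score to a p-value with the standard normal CDF. It assumes a two-sided test, p = 2(1 - Φ(|score|)), which this page does not state explicitly. The class NormalPValueSketch is hypothetical, and the error function is a textbook approximation (Abramowitz and Stegun 7.1.26) rather than whatever numerical routine the library uses.

```java
/** Hypothetical helper, not part of Tetrad: p-value of a score under a standard normal null. */
final class NormalPValueSketch {

    /** Abramowitz-Stegun approximation 7.1.26 of erf(x); absolute error is about 1.5e-7. */
    static double erf(double x) {
        double sign = x < 0 ? -1.0 : 1.0;
        double ax = Math.abs(x);
        double t = 1.0 / (1.0 + 0.3275911 * ax);
        double poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
                - 0.284496736) * t + 0.254829592) * t;
        return sign * (1.0 - poly * Math.exp(-ax * ax));
    }

    /** Standard normal CDF: Phi(z) = (1 + erf(z / sqrt(2))) / 2. */
    static double standardNormalCdf(double z) {
        return 0.5 * (1.0 + erf(z / Math.sqrt(2.0)));
    }

    /** Assumed two-sided p-value: 2 * (1 - Phi(|score|)). */
    static double twoSidedPValue(double score) {
        return 2.0 * (1.0 - standardNormalCdf(Math.abs(score)));
    }
}
```

Under this assumption a score of 0 gives a p-value near 1, and a score near 1.96 gives a p-value near 0.05.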
-
getRows
Retrieves the list of row indices currently set for the analysis. If no rows are set, returns a list of all row indices.
- Specified by:
getRows in interface RowsSettable
- Returns:
- A list of row indices.
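The fallback described above (all row indices when none have been set) can be sketched as follows. RowSelectionSketch is a hypothetical stand-in, not the Tetrad class, and the use of List<Integer> for the row indices is an assumption, since this page does not show the type.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a row-selecting test; not the Tetrad implementation. */
final class RowSelectionSketch {
    private final int sampleSize;  // total number of rows in the data
    private List<Integer> rows;    // null means that no subset has been set

    RowSelectionSketch(int sampleSize) {
        this.sampleSize = sampleSize;
    }

    /** Stores a defensive copy of the selected row indices (null clears the selection). */
    void setRows(List<Integer> rows) {
        this.rows = rows == null ? null : new ArrayList<>(rows);
    }

    /** Returns the selected rows, or 0..sampleSize-1 when nothing has been selected. */
    List<Integer> getRows() {
        if (rows != null) {
            return new ArrayList<>(rows);
        }
        List<Integer> all = new ArrayList<>(sampleSize);
        for (int i = 0; i < sampleSize; i++) {
            all.add(i);
        }
        return all;
    }
}
```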
-
setRows
Sets the list of row indices.
- Specified by:
setRows in interface RowsSettable
- Parameters:
rows - The list of row indices to set.
-
getNumFunctions
public int getNumFunctions()
Retrieves the number of functions used in the ConditionalCorrelationIndependence analysis.
- Returns:
- The number of functions used in the analysis.
-
setNumFunctions
public void setNumFunctions(int numFunctions)
Sets the number of functions used in the ConditionalCorrelationIndependence analysis.
- Parameters:
numFunctions - The number of functions to set. This value must be a positive integer.
-
getScalingFactor
public double getScalingFactor()
Retrieves the kernel scaling factor.
- Returns:
- The scaling factor used in the analysis.
-
setScalingFactor
public void setScalingFactor(double scalingFactor)
Sets the bandwidth adjustment value for the ConditionalCorrelationIndependence analysis. The default is 2.
- Parameters:
scalingFactor - The new bandwidth adjustment factor to be used. This value adjusts the bandwidth calculation for conditional independence tests and impacts the sensitivity of the kernel-based analysis.
-
permutationTest
public double permutationTest(Node x, Node y, Set<Node> z, int numPermutations)
Performs a permutation test to empirically determine the distribution of p-values under the null hypothesis.
- Parameters:
x - The first node.
y - The second node.
z - The set of conditioning nodes.
numPermutations - The number of permutations to perform.
- Returns:
- The mean p-value for the given number of permutations.
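For readers unfamiliar with permutation tests, here is a generic sketch of the technique. It is not this method's procedure: it ignores the conditioning set, uses the absolute Pearson correlation as the statistic, and returns the usual permutation p-value (the share of shuffled statistics at least as large as the observed one) rather than a mean p-value. PermutationTestSketch, its methods, and the seed parameter are all hypothetical.

```java
import java.util.Random;

/** Hypothetical illustration of a generic permutation test; not the Tetrad procedure. */
final class PermutationTestSketch {

    /** Pearson correlation; returns NaN if either series is constant. */
    static double pearson(double[] a, double[] b) {
        int n = a.length;
        double meanA = 0.0, meanB = 0.0;
        for (int i = 0; i < n; i++) {
            meanA += a[i];
            meanB += b[i];
        }
        meanA /= n;
        meanB /= n;
        double sab = 0.0, saa = 0.0, sbb = 0.0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            sab += da * db;
            saa += da * da;
            sbb += db * db;
        }
        return sab / Math.sqrt(saa * sbb);
    }

    /** Permutation p-value for |corr(x, y)|, shuffling y to break any dependence on x. */
    static double permutationPValue(double[] x, double[] y, int numPermutations, long seed) {
        double observed = Math.abs(pearson(x, y));
        Random rng = new Random(seed);
        double[] shuffled = y.clone();
        int atLeastAsExtreme = 0;
        for (int p = 0; p < numPermutations; p++) {
            // Fisher-Yates shuffle of the y values.
            for (int i = shuffled.length - 1; i > 0; i--) {
                int j = rng.nextInt(i + 1);
                double tmp = shuffled[i];
                shuffled[i] = shuffled[j];
                shuffled[j] = tmp;
            }
            if (Math.abs(pearson(x, shuffled)) >= observed) {
                atLeastAsExtreme++;
            }
        }
        // Add-one correction keeps the estimate strictly positive.
        return (atLeastAsExtreme + 1.0) / (numPermutations + 1.0);
    }
}
```

A conditional version would need a shuffling scheme that preserves the relationship of each variable with the conditioning set; that step is left out here.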
-