Package edu.cmu.tetrad.search.test
Class ConditionalCorrelationIndependence
java.lang.Object
edu.cmu.tetrad.search.test.ConditionalCorrelationIndependence
- All Implemented Interfaces:
- RowsSettable
Checks conditional independence of variables in a continuous data set using Daudin's method. See
 
Ramsey, J. D. (2014). A scalable conditional independence test for nonlinear, non-Gaussian data. arXiv preprint arXiv:1401.5031.
This is corrected using Lemma 2, condition 4 of
Zhang, K., Peters, J., Janzing, D., and Schölkopf, B. (2012). Kernel-based conditional independence test and application in causal discovery. arXiv preprint arXiv:1202.3775.
This all follows the original Daudin paper, which is this:
Daudin, J. J. (1980). Partial association measures and an application to qualitative regression. Biometrika, 67(3), 581-590.
Updated 2024-11-24 josephramsey
- Author:
- josephramsey
- 
Constructor Summary
Constructors
- ConditionalCorrelationIndependence(DataSet dataSet, int basisType, double basisScale, int numFunctions) - Initializes a new instance of the ConditionalCorrelationIndependence class using the provided DataSet.
- 
Method Summary
- int getNumFunctions() - Retrieves the number of functions used in the ConditionalCorrelationIndependence analysis.
- double getPValue(double score) - Calculates the p-value for a given score using the cumulative distribution function (CDF) of a standard normal distribution.
- getRows() - Retrieves the list of row indices currently set for the analysis.
- double getScalingFactor() - Retrieves the kernel scaling factor.
- double isIndependent(Node x, Node y, Set<Node> _z) - Determines whether two given nodes are independent given a set of conditioning nodes, and calculates a score.
- double permutationTest(Node x, Node y, Set<Node> z, int numPermutations) - Performs a permutation test to empirically determine the distribution of p-values under the null hypothesis.
- void setNumFunctions(int numFunctions) - Sets the number of functions used in the ConditionalCorrelationIndependence analysis.
- void setRows(rows) - Sets the list of row indices.
- void setScalingFactor(double scalingFactor) - Sets the bandwidth adjustment value for the ConditionalCorrelationIndependence analysis.
- 
Constructor Details
ConditionalCorrelationIndependence
public ConditionalCorrelationIndependence(DataSet dataSet, int basisType, double basisScale, int numFunctions)
Initializes a new instance of the ConditionalCorrelationIndependence class using the provided DataSet.
- Parameters:
- dataSet - The dataset to be used for the analysis. This dataset must not be null and will be standardized.
- basisType - The type of basis function to be used in the analysis. This value must be a positive integer.
- basisScale - The scaling factor used to adjust the bandwidth for the analysis, or 0.0 if the data should be standardized.
- numFunctions - The number of functions to be used in the analysis. This value must be a positive integer.
- Throws:
- NullPointerException - if the provided dataset is null.
 
 
- 
- 
Method Details
isIndependent
Determines whether two given nodes are independent given a set of conditioning nodes, and calculates a score.
- Parameters:
- x - The first node.
- y - The second node.
- _z - The set of conditioning nodes.
- Returns:
- The score representing the level of independence between nodes x and y given the conditioning set _z. Returns Double.NaN if the score cannot be computed or is not a number.
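The reason a Daudin-style test correlates functions of the variables, rather than the variables themselves, can be sketched in the unconditional case. The example below is only an illustration under stated assumptions: it uses plain powers as the basis functions, ignores the conditioning set, and returns the largest absolute correlation instead of the score this class computes. The class and method names are hypothetical and are not part of Tetrad.

```java
// Hypothetical sketch, not Tetrad code: correlate basis functions of x and y.
final class BasisCorrelationSketch {

    // Pearson correlation of two equal-length arrays; NaN if either is constant.
    static double correlation(double[] a, double[] b) {
        int n = a.length;
        double meanA = 0.0, meanB = 0.0;
        for (int i = 0; i < n; i++) {
            meanA += a[i];
            meanB += b[i];
        }
        meanA /= n;
        meanB /= n;
        double sab = 0.0, saa = 0.0, sbb = 0.0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            sab += da * db;
            saa += da * da;
            sbb += db * db;
        }
        return sab / Math.sqrt(saa * sbb);
    }

    // Assumed basis: v, v^2, ..., v^numFunctions (a polynomial basis).
    static double[] power(double[] v, int degree) {
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = Math.pow(v[i], degree);
        }
        return out;
    }

    // Largest |corr(x^i, y^j)| over all pairs of basis functions.
    static double maxAbsCorrelation(double[] x, double[] y, int numFunctions) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have equal length");
        }
        if (numFunctions < 1) {
            throw new IllegalArgumentException("numFunctions must be positive");
        }
        double best = Double.NaN;
        for (int i = 1; i <= numFunctions; i++) {
            double[] fx = power(x, i);
            for (int j = 1; j <= numFunctions; j++) {
                double r = Math.abs(correlation(fx, power(y, j)));
                // Skip undefined correlations (constant columns).
                if (!Double.isNaN(r) && (Double.isNaN(best) || r > best)) {
                    best = r;
                }
            }
        }
        return best;
    }
}
```

With x symmetric around zero and y = x^2, the linear correlation is zero, so a single basis function misses the dependence, while two basis functions per variable expose it.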
 
- 
getPValue
public double getPValue(double score)
Calculates the p-value for a given score using the cumulative distribution function (CDF) of a standard normal distribution.
- Parameters:
- score - The score for which the p-value needs to be calculated. This score is typically a test statistic resulting from some statistical test.
- Returns:
- The p-value corresponding to the given score, indicating the probability of obtaining a value at least as extreme as the observed score under the null hypothesis.
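As a rough illustration of turning a score into a p-value with the standard normal CDF, here is a minimal sketch. It assumes a two-sided test, which the description above does not state, and it approximates the CDF with the Abramowitz and Stegun polynomial (formula 26.2.17) because the Java standard library has no error function. The class and method names are hypothetical.

```java
// Hypothetical sketch, not Tetrad code: p-value of a score under N(0, 1).
final class NormalPValueSketch {

    // Standard normal CDF via Abramowitz & Stegun 26.2.17
    // (absolute error below about 1e-7).
    static double standardNormalCdf(double x) {
        double ax = Math.abs(x);
        double t = 1.0 / (1.0 + 0.2316419 * ax);
        double poly = t * (0.319381530
                + t * (-0.356563782
                + t * (1.781477937
                + t * (-1.821255978
                + t * 1.330274429))));
        double density = Math.exp(-0.5 * ax * ax) / Math.sqrt(2.0 * Math.PI);
        double upperTail = density * poly; // P(Z > |x|)
        return x >= 0 ? 1.0 - upperTail : upperTail;
    }

    // Assumption: two-sided p-value, P(|Z| >= |score|).
    static double twoSidedPValue(double score) {
        if (Double.isNaN(score)) {
            return Double.NaN;
        }
        return 2.0 * (1.0 - standardNormalCdf(Math.abs(score)));
    }
}
```

Under this assumption a score of 0 gives a p-value of about 1, and a score near 1.96 gives a p-value of about 0.05.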
 
- 
getRows
Retrieves the list of row indices currently set for the analysis. If no rows are set, returns a list of all row indices.
- Specified by:
- getRows in interface RowsSettable
- Returns:
- A list of row indices.
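The fallback described for getRows (all row indices when no subset has been set) could look roughly like the sketch below. This is an assumption about the behavior, not the class's actual code: the holder class is hypothetical, and the element type List<Integer> is assumed because the page does not show it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Tetrad code: row subset with an "all rows" default.
final class RowSelectionSketch {
    private final int numRows;
    private List<Integer> rows; // null means no subset has been set

    RowSelectionSketch(int numRows) {
        this.numRows = numRows;
    }

    // Returns the selected rows, or 0..numRows-1 when nothing was set.
    List<Integer> getRows() {
        if (rows == null) {
            List<Integer> all = new ArrayList<>(numRows);
            for (int i = 0; i < numRows; i++) {
                all.add(i);
            }
            return all;
        }
        return new ArrayList<>(rows); // defensive copy
    }

    // Stores a copy of the given indices; null clears the selection.
    void setRows(List<Integer> rows) {
        this.rows = (rows == null) ? null : new ArrayList<>(rows);
    }
}
```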
 
- 
setRows
Sets the list of row indices.
- Specified by:
- setRows in interface RowsSettable
- Parameters:
- rows - The list of row indices to set.
 
- 
getNumFunctions
public int getNumFunctions()
Retrieves the number of functions used in the ConditionalCorrelationIndependence analysis.
- Returns:
- The number of functions used in the analysis.
 
- 
setNumFunctions
public void setNumFunctions(int numFunctions)
Sets the number of functions used in the ConditionalCorrelationIndependence analysis.
- Parameters:
- numFunctions - The number of functions to set. This value must be a positive integer.
 
- 
getScalingFactor
public double getScalingFactor()
Retrieves the kernel scaling factor.
- Returns:
- The scaling factor used in the analysis.
 
- 
setScalingFactor
public void setScalingFactor(double scalingFactor)
Sets the bandwidth adjustment value for the ConditionalCorrelationIndependence analysis. The default is 2.
- Parameters:
- scalingFactor - The new bandwidth adjustment factor to be used. This value adjusts the bandwidth calculation for conditional independence tests and impacts the sensitivity of the kernel-based analysis.
 
- 
permutationTest
Performs a permutation test to empirically determine the distribution of p-values under the null hypothesis.
- Parameters:
- x - The first node.
- y - The second node.
- z - The set of conditioning nodes.
- numPermutations - The number of permutations to perform.
- Returns:
- The mean p-value for the given number of permutations.
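The general shape of such a procedure can be sketched as follows: shuffle one variable to break any dependence, recompute a p-value, and average over the permutations. This is a simplified illustration under assumptions. It ignores the conditioning set, takes the p-value computation as a caller-supplied function, and uses a seed parameter for reproducibility; how this class actually permutes the data is not stated on this page. All names are hypothetical.

```java
import java.util.Random;
import java.util.function.ToDoubleBiFunction;

// Hypothetical sketch, not Tetrad code: mean p-value over shuffled copies of y.
final class PermutationSketch {

    static double meanPermutedPValue(double[] x, double[] y,
                                     ToDoubleBiFunction<double[], double[]> pValue,
                                     int numPermutations, long seed) {
        if (numPermutations < 1) {
            throw new IllegalArgumentException("numPermutations must be positive");
        }
        Random rng = new Random(seed);
        double[] shuffled = y.clone(); // leave the caller's array untouched
        double sum = 0.0;
        for (int p = 0; p < numPermutations; p++) {
            // Fisher-Yates shuffle of the working copy.
            for (int i = shuffled.length - 1; i > 0; i--) {
                int j = rng.nextInt(i + 1);
                double tmp = shuffled[i];
                shuffled[i] = shuffled[j];
                shuffled[j] = tmp;
            }
            sum += pValue.applyAsDouble(x, shuffled);
        }
        return sum / numPermutations;
    }
}
```

Any function mapping two samples to a p-value can be passed as the third argument, for example one built from a correlation-based score.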
 
 
-