Class IndTestFdrWrapper
- All Implemented Interfaces:
IndependenceTest
IndependenceTest that enforces False Discovery Rate (FDR) control on independence
decisions. This class uses either Benjamini-Hochberg (BH) or Benjamini-Yekutieli (BY) FDR control methods and
supports both global and stratified FDR control across conditioning sets.
The workflow is divided into two phases:
1. Recording Epoch: Raw p-values for independence tests are cached from the underlying test.
2. Decision Epoch: Enforces FDR-controlled cutoffs based on cached p-values.
The cutoffs can be computed globally or within groups based on the cardinality of the conditioning set (|Z|).
This class also maintains a mechanism to track "mind-changes," i.e., decisions that change between subsequent algorithm passes.
The wrapper allows for integrating FDR-controlled independence testing into iterative search algorithms without re-computing p-values across epochs, ensuring reproducibility and efficiency.
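The stratified mode described above can be illustrated with a minimal sketch, assuming the cached p-values are grouped by conditioning-set size |Z| and a Benjamini-Hochberg cutoff is computed separately for each group. The class and method names (StratifiedFdrSketch, bhCutoff, cutoffsBySize) are hypothetical and are not part of this class's API; the wrapper's actual grouping and cutoff conventions may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not the wrapper's implementation.
final class StratifiedFdrSketch {

    // Benjamini-Hochberg: the largest sorted p-value p(k) with p(k) <= k * q / m,
    // or 0.0 when no p-value qualifies.
    static double bhCutoff(List<Double> pValues, double q) {
        List<Double> sorted = new ArrayList<>(pValues);
        Collections.sort(sorted);
        int m = sorted.size();
        double cutoff = 0.0;
        for (int k = 1; k <= m; k++) {
            if (sorted.get(k - 1) <= k * q / m) {
                cutoff = sorted.get(k - 1);
            }
        }
        return cutoff;
    }

    // sizes[i] is |Z| for the i-th recorded test and pValues[i] is its raw p-value.
    // Returns one cutoff per distinct conditioning-set size.
    static Map<Integer, Double> cutoffsBySize(int[] sizes, double[] pValues, double q) {
        Map<Integer, List<Double>> groups = new TreeMap<>();
        for (int i = 0; i < sizes.length; i++) {
            groups.computeIfAbsent(sizes[i], s -> new ArrayList<>()).add(pValues[i]);
        }
        Map<Integer, Double> cutoffs = new TreeMap<>();
        for (Map.Entry<Integer, List<Double>> e : groups.entrySet()) {
            cutoffs.put(e.getKey(), bhCutoff(e.getValue(), q));
        }
        return cutoffs;
    }
}
```

Global control corresponds to calling bhCutoff once on all p-values; stratified control lets tests with larger conditioning sets, which tend to be more numerous, receive their own cutoff.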
-
Constructor Summary
Constructors
Constructor and Description
IndTestFdrWrapper(IndependenceTest base, boolean negativelyCorrelated, double alpha, double fdrQ)
    Constructs an instance of IndTestFdrWrapper, which wraps an existing IndependenceTest to apply False Discovery Rate (FDR) control during independence testing.
-
Method Summary
Modifier and Type, Method, and Description
IndependenceResult checkIndependence(Node x, Node y, Set<Node> z)
    Checks the independence of two nodes given a conditioning set of nodes.
void computeAlphaStar()
    Computes the adjusted alpha threshold (alphaStar) for controlling the False Discovery Rate (FDR) during independence tests.
int countMindChanges()
    Returns the number of mind-changes between the previous decision epoch and the current one.
String cutoffsSummary()
    Provides a summary of the current FDR cutoff parameters in a formatted string.
static Graph doFdrLoop(IGraphSearch search, boolean negativelyCorrelated, double alpha, double fdrQ, boolean verbose)
    Executes a loop for controlling the false discovery rate (FDR) as part of a graph search process.
DataModel getData()
    Retrieves the data model associated with the wrapped IndependenceTest.
List<DataSet> getDataSets()
    Retrieves the data sets used in the underlying independence test.
int getSampleSize()
    Retrieves the sample size used in the underlying independence test.
List<Node> getVariables()
    Retrieves a list of variables involved in the independence tests.
boolean isVerbose()
    Determines whether verbose output is enabled in the underlying independence test wrapped by this instance.
void setVerbose(boolean verbose)
    Sets the verbose output flag for the underlying independence test.
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface edu.cmu.tetrad.search.test.IndependenceTest
checkIndependence, determines, getAlpha, getCov, getVariable, getVariableNames, indTestSubset, setAlpha, toString
-
Constructor Details
-
IndTestFdrWrapper
public IndTestFdrWrapper(IndependenceTest base, boolean negativelyCorrelated, double alpha, double fdrQ)
Constructs an instance of IndTestFdrWrapper, which wraps an existing IndependenceTest to apply False Discovery Rate (FDR) control during independence testing. The wrapper enforces an FDR threshold using the given parameters, allowing for control over the proportion of false positives in the testing process.
- Parameters:
base - The underlying IndependenceTest object to be wrapped, which performs the actual independence tests.
negativelyCorrelated - A flag indicating whether to focus on negatively correlated variables during the FDR process.
alpha - The base significance level for the independence tests. Must be in the range [0, 1].
fdrQ - The FDR threshold parameter, representing the desired upper bound on the proportion of false discoveries. Must be in the range [0, 1].
- Throws:
NullPointerException - If the base independence test is null.
IllegalArgumentException - If alpha is not in the range [0, 1] or fdrQ is not in the range [0, 1].
-
-
Method Details
-
doFdrLoop
public static Graph doFdrLoop(IGraphSearch search, boolean negativelyCorrelated, double alpha, double fdrQ, boolean verbose) throws InterruptedException
Executes a loop for controlling the false discovery rate (FDR) as part of a graph search process. The method iteratively adjusts the FDR threshold (alphaStar) based on accumulated p-values from independence tests and continues until the number of changes (new facts or flips) between epochs falls below a specified threshold or the maximum number of epochs is reached.
- Parameters:
search - The graph search instance used for discovering dependencies and independencies.
negativelyCorrelated - A flag indicating whether to consider only negatively correlated variables.
alpha - The base significance level for independence tests.
fdrQ - The false discovery rate (FDR) threshold parameter.
verbose - A flag to control whether detailed logs should be displayed during the process.
- Returns:
- The resulting graph discovered after applying the FDR control mechanism.
- Throws:
InterruptedException - If the process is interrupted during execution.
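The stopping rule described for doFdrLoop can be sketched as plain control flow. This is a minimal illustration under stated assumptions, not the method's implementation: the names FdrLoopSketch, runUntilStable, maxEpochs, and changeThreshold are hypothetical, and the actual threshold and epoch limit used by doFdrLoop are not given in this documentation. The two callbacks stand in for recomputing the cutoff and for running one search pass that reports its number of changes.

```java
import java.util.function.IntSupplier;

// Hypothetical illustration of the epoch loop; not the Tetrad implementation.
final class FdrLoopSketch {

    // Repeats "recompute cutoff, run one pass" until the pass reports fewer than
    // changeThreshold changes or maxEpochs passes have run. Returns the number of passes.
    static int runUntilStable(Runnable recomputeCutoff,
                              IntSupplier runPassAndCountChanges,
                              int maxEpochs,
                              int changeThreshold) {
        int epochs = 0;
        while (epochs < maxEpochs) {
            recomputeCutoff.run();                            // e.g. update alphaStar from cached p-values
            int changes = runPassAndCountChanges.getAsInt();  // e.g. one search pass in decision mode
            epochs++;
            if (changes < changeThreshold) {
                break;                                        // decisions have stabilized
            }
        }
        return epochs;
    }
}
```

The cutoff is recomputed at the start of each epoch because a pass may request tests that were never cached before, which changes the set of p-values the cutoff is based on.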
-
computeAlphaStar
public void computeAlphaStar()
Computes the adjusted alpha threshold (alphaStar) for controlling the False Discovery Rate (FDR) during independence tests. This method collects the p-values for all tested hypotheses, applies an FDR cutoff algorithm using the specified FDR threshold (fdrQ) and the negatively correlated parameter, and updates the alphaStar field with the newly computed value.
This adjustment ensures that the proportion of false discoveries among rejected hypotheses is controlled according to the specified FDR threshold.
The p-values are retrieved from the `pvals` field, which maps independence facts to their p-values. The computed alphaStar value is stored for subsequent use in decision-making processes.
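For orientation, the two standard step-up rules named in the class description can be sketched as follows. This is a minimal sketch, not the code behind computeAlphaStar: the names FdrCutoffSketch, cutoff, and useBy are hypothetical, and it is an assumption that the negativelyCorrelated flag selects the Benjamini-Yekutieli (BY) variant. Returning the largest qualifying p-value as the cutoff is one common convention; an implementation may instead return the threshold k * q / m.

```java
import java.util.Arrays;

// Hypothetical illustration of BH and BY cutoffs; not the Tetrad implementation.
final class FdrCutoffSketch {

    // Returns the largest sorted p-value p(k) with p(k) <= (k / m) * q / c,
    // or 0.0 if none qualifies. c = 1 for BH; c = 1 + 1/2 + ... + 1/m for BY.
    static double cutoff(double[] pValues, double q, boolean useBy) {
        int m = pValues.length;
        if (m == 0) {
            return 0.0;
        }
        double[] sorted = pValues.clone();
        Arrays.sort(sorted);

        double c = 1.0;
        if (useBy) {
            c = 0.0;
            for (int i = 1; i <= m; i++) {
                c += 1.0 / i;   // harmonic correction for arbitrary dependence
            }
        }

        double alphaStar = 0.0;
        for (int k = 1; k <= m; k++) {
            if (sorted[k - 1] <= (k / (double) m) * q / c) {
                alphaStar = sorted[k - 1];   // keep the largest qualifying p-value
            }
        }
        return alphaStar;
    }
}
```

With p-values {0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205} and q = 0.05, the BH rule yields a cutoff of 0.008, while the more conservative BY rule yields 0.001.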
-
countMindChanges
public int countMindChanges()
Returns the number of mind-changes between the previous decision epoch and the current one. Call this after completing an algorithm pass that uses this wrapper in decision mode.
- Returns:
- The number of mind-changes between the previous decision epoch and the current one.
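One way to picture a mind-change count is as a comparison of two decision maps, one per epoch. The sketch below is an assumption-laden illustration: MindChangeSketch and countFlips are hypothetical names, facts are represented as plain strings, and only flipped decisions are counted. Whether the wrapper also counts facts that appear for the first time is not stated in this section.

```java
import java.util.Map;

// Hypothetical illustration of counting decision flips; not the Tetrad implementation.
final class MindChangeSketch {

    // Counts facts present in both epochs whose independence decision differs.
    static int countFlips(Map<String, Boolean> previous, Map<String, Boolean> current) {
        int flips = 0;
        for (Map.Entry<String, Boolean> e : current.entrySet()) {
            Boolean before = previous.get(e.getKey());
            if (before != null && !before.equals(e.getValue())) {
                flips++;
            }
        }
        return flips;
    }
}
```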
-
checkIndependence
public IndependenceResult checkIndependence(Node x, Node y, Set<Node> z) throws InterruptedException
Checks the independence of two nodes given a conditioning set of nodes. This method uses cached p-values to avoid redundant independence tests. If a p-value is not available in the cache, it computes it using the underlying independence test and then stores the result for future use.
- Specified by:
checkIndependence in interface IndependenceTest
- Parameters:
x - The first node being tested for independence.
y - The second node being tested for independence.
z - The set of nodes conditioned upon, representing the known variables.
- Returns:
- An IndependenceResult object containing the independence determination, the corresponding p-value, and the difference between the alpha threshold and the p-value.
- Throws:
InterruptedException - If the process is interrupted during execution.
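The cache-then-compute behavior described above can be sketched with a map keyed by a canonical form of the tested fact. This is a minimal sketch under assumptions: PValueCacheSketch, key, pValue, and baseCalls are hypothetical names, nodes are represented as strings, and it is assumed that the fact is symmetric in x and y and independent of the iteration order of z. The base test is modeled as a function from the key to a p-value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.ToDoubleFunction;

// Hypothetical illustration of p-value caching; not the Tetrad implementation.
final class PValueCacheSketch {
    private final Map<String, Double> pvals = new HashMap<>();
    private final ToDoubleFunction<String> baseTest;
    int baseCalls = 0;   // number of times the underlying test was actually run

    PValueCacheSketch(ToDoubleFunction<String> baseTest) {
        this.baseTest = baseTest;
    }

    // Canonical key: x and y in lexicographic order, z sorted.
    static String key(String x, String y, Set<String> z) {
        String a = x.compareTo(y) <= 0 ? x : y;
        String b = x.compareTo(y) <= 0 ? y : x;
        return a + "," + b + "|" + new TreeSet<>(z);
    }

    // Returns the cached p-value, running the base test only on a cache miss.
    double pValue(String x, String y, Set<String> z) {
        return pvals.computeIfAbsent(key(x, y, z), k -> {
            baseCalls++;
            return baseTest.applyAsDouble(k);
        });
    }
}
```

Because the raw p-value does not depend on the cutoff, a later epoch can re-decide the same fact against a new alphaStar without re-running the base test.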
-
getVariables
public List<Node> getVariables()
Retrieves a list of variables involved in the independence tests.
- Specified by:
getVariables in interface IndependenceTest
- Returns:
- A list of Node objects representing the variables.
-
getData
public DataModel getData()
Retrieves the data model associated with the wrapped IndependenceTest.
- Specified by:
getData in interface IndependenceTest
- Returns:
- A DataModel object representing the data model used in the underlying independence test.
-
isVerbose
public boolean isVerbose()
Determines whether verbose output is enabled in the underlying independence test wrapped by this instance.
- Specified by:
isVerbose in interface IndependenceTest
- Returns:
- true if verbose output is enabled; false otherwise.
-
setVerbose
public void setVerbose(boolean verbose)
Sets the verbose output flag for the underlying independence test.
- Specified by:
setVerbose in interface IndependenceTest
- Parameters:
verbose - true to enable verbose output; false to disable it.
-
getSampleSize
public int getSampleSize()
Retrieves the sample size used in the underlying independence test.
- Specified by:
getSampleSize in interface IndependenceTest
- Returns:
- The sample size as an integer.
-
getDataSets
public List<DataSet> getDataSets()
Retrieves the data sets used in the underlying independence test.
- Specified by:
getDataSets in interface IndependenceTest
- Returns:
- A list of DataSet objects representing the data sets.
-
cutoffsSummary
public String cutoffsSummary()
Provides a summary of the current FDR cutoff parameters in a formatted string.
- Returns:
- A string summarizing the false discovery rate (FDR) parameters, including whether the data is negatively or positively correlated, the FDR q-value, the number of variables (m), and the computed alpha threshold (α*).
-