Class Fask

java.lang.Object
edu.cmu.tetrad.search.Fask
All Implemented Interfaces:
IGraphSearch

public final class Fask extends Object implements IGraphSearch
Implements the FASK (Fast Adjacency Skewness) algorithm, which makes decisions for adjacency and orientation using a combination of conditional independence testing, judgments of nonlinear adjacency, and pairwise orientation due to non-Gaussianity. The reference is this:

Sanchez-Romero, R., Ramsey, J. D., Zhang, K., Glymour, M. R., Huang, B., and Glymour, C. (2019). Estimating feedforward and feedback effective connections from fMRI time series: Assessments of statistical methods. Network Neuroscience, 3(2), 274-30

Some adjustments have been made in some ways from that version, and some additional pairwise options have been added from this reference:

Hyvärinen, A., and Smith, S. M. (2013). Pairwise likelihood ratios for estimation of non-Gaussian structural equation models. Journal of Machine Learning Research, 14(Jan), 111-152.

This method (and the Hyvarinen and Smith methods) make the assumption that the data are generated by a linear, non-Gaussian causal process and attempts to recover the causal graph for that process. They do not attempt to recover the parametrization of this graph; for this a separate estimation algorithm would be needed, such as linear regression regressing each node onto its parents. A further assumption is made, that there are no latent common causes of the algorithm. This is not a constraint on the pairwise orientation methods, since they orient with respect only to the two variables at the endpoints of an edge and so are happy with all other variables being considered latent with respect to that single edge. However, if the built-in adjacency search is used (FAS-Stable), the existence of latents will throw this method off.

As was shown in the Hyvarinen and Smith paper above, FASK works quite well even if the graph contains feedback loops in most configurations, including 2-cycles. 2-cycles can be detected fairly well if the FASK left-right rule is selected and the 2-cycle threshold set to about 0.1--more will be detected (or hallucinated) if the threshold is set higher. As shown in the Sanchez-Romero reference above, 2-cycle detection of the FASK algorithm using this rule is quite good.

Some edges may be undiscoverable by FAS-Stable; to recover more of these edges, a test related to the FASK left-right rule is used, and there is a threshold for this test. A good default for this threshold (the "skew edge threshold") is 0.3. For more of these edges, set this threshold to a lower number.

It is assumed that the data are arranged so the each variable forms a column and that there are no missing values. The data matrix is assumed to be rectangular. To this end, the Tetrad DataSet class is used, which enforces this.

Note that orienting a DAG for a linear, non-Gaussian model using the Hyvarinen and Smith pairwise rules is alternatively known in the literature as Pairwise LiNGAM--see Hyvärinen, A., and Smith, S. M. (2013). Pairwise likelihood ratios for estimation of non-Gaussian structural equation models. Journal of Machine Learning Research, 14(Jan), 111-152. We include some of these methods here for comparison.

Parameters:

faskAdjacencyMethod: 1 # this run FAS-Stable (the one used in the paper). See Algorithm 2.

depth: -1. # control the size of the conditional set in the independence tests, setting this to a small integer may reduce the running time, but can also result in false positives. -1 means that it will check "all" possible sizes.

test: sem-bic-test # test for FAS adjacency

score: sem-bic-score

semBicRule: 1 # to set the Chickering Rule, used in the original Fask

penaltyDiscount: 2 # if using sem-bic as independence test (as in the paper). In the paper this is referred as c. Check step 1 and 10 in Algorithm 2 FAS stable.

skewEdgeThreshold: 0.3 # See description of Fask algorithm, and step 11 in Algorithm 1 FASK. Threshold to add edges that may have been non-inferred because there was a positive/negative cycle that result in a non-zero observed relation.

faskLeftRightRule: 1 # this run FASK v1, the original FASK from the paper

faskDelta: -0.3 # See step 1 and 11 in Algorithm 4 (this is the value set in the paper)

twoCycleScreeningThreshold: 0 # not used in the original paper implementation. Added afterwards. You can set it to 0.3, for example, to use it as a filter to run Algorithm 3 2-cycle detection, which may take some time to run.

orientationAlpha: 0.1 # this was referred in the paper as TwoCycle Alpha or just alpha, the lower it is, the lower the chance of inferring a two cycle. Check steps 17 to 28 in Algorithm 3: 2 Cycle Detection Rule.

structurePrior: 0 # prior on the number of parents. Not used in the paper implementation.

So a run of command line would look like this:

java -jar -Xmx10G causal-cmd-1.4.1-jar-with-dependencies.jar --delimiter tab --data-type continuous --dataset concat_BOLDfslfilter_60_FullMacaque.txt --prefix Fask_Test_MacaqueFull --algorithm fask --faskAdjacencyMethod 1 --depth -1 --test sem-bic-test --score sem-bic-score --semBicRule 1 --penaltyDiscount 2 --skewEdgeThreshold 0.3 --faskLeftRightRule 1 --faskDelta -0.3 --twoCycleScreeningThreshold 0 --orientationAlpha 0.1 -structurePrior 0

This class is configured to respect knowledge of forbidden and required edges, including knowledge of temporal tiers.

Version:
$Id: $Id
Author:
josephramsey, rubensanchez
See Also:
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    static enum 
    Enumerates the alternatives to use for finding the initial adjacencies for FASK.
    static enum 
    Enumerates the options left-right rules to use for FASK.
  • Constructor Summary

    Constructors
    Constructor
    Description
    Fask(DataSet dataSet, Score score, IndependenceTest test)
    Constructor.
  • Method Summary

    Modifier and Type
    Method
    Description
    static double[]
    correctSkewness(double[] data, double sk)
    Corrects the skewness of the given data using the provided skewness value.
    static double
    faskLeftRightV1(double[] x, double[] y, boolean empirical, double delta)
    Calculates the left-right ratio using the Fask method version 1.
    static double
    faskLeftRightV2(double[] x, double[] y, boolean empirical, double delta)
    Calculates the left-right judgment for two arrays of double values.
    static double
    g(double x)
    Calculates the logarithm of the hyperbolic cosine of the maximum of x and 0.
    double[][]
    Returns the coefficient matrix for the search.
    int
    Getter for the field depth.
    long
    getElapsedTime.
    Getter for the field knowledge.
    double[][]
    Returns a matrix of left-right scores for the search.
    double
    leftRight(double[] x, double[] y)
    A left/right judgment for double[] arrays (data) as input.
    static double
    robustSkew(double[] x, double[] y, boolean empirical)
    Calculates a left-right judgment using the robust skewness between two arrays of double values.
    Runs the search on the concatenated data, returning a graph, possibly cyclic, possibly with two-cycles.
    void
    Sets the adjacency method used.
    void
    setDelta(double delta)
    Sets the delta to use.
    void
    setDepth(int depth)
    Setter for the field depth.
    void
    setEmpirical(boolean empirical)
    Sets whether the empirical option is selected.
    void
    setExternalGraph(Graph externalGraph)
    Sets the external graph to use.
    void
    Setter for the field knowledge.
    void
    Sets the left-right rule used
    void
    setOrientationAlpha(double orientationAlpha)
    Sets the orientation alpha.
    void
    setSeed(long seed)
    Sets the seed for generating random numbers.
    void
    setSkewEdgeThreshold(double skewEdgeThreshold)
    Sets the skew-edge threshold.
    void
    setTwoCycleScreeningCutoff(double twoCycleScreeningCutoff)
    Sets the cutoff for two-cycle screening.
    void
    setVerbose(boolean verbose)
    Sets the verbose mode.
    static double
    skew(double[] x, double[] y, boolean empirical)
    Calculates a left-right judgument using the skewness of two arrays of double values.

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • Fask

      public Fask(DataSet dataSet, Score score, IndependenceTest test)
      Constructor.
      Parameters:
      dataSet - A continuous dataset over variables V.
      test - An independence test over variables V. (Used for FAS.)
      score - a Score object
  • Method Details

    • faskLeftRightV2

      public static double faskLeftRightV2(double[] x, double[] y, boolean empirical, double delta)
      Calculates the left-right judgment for two arrays of double values. This is for version 2.
      Parameters:
      x - The data for the first variable.
      y - The data for the second variable.
      empirical - Whether to use an empirical judgment.
      delta - The delta value for the judgment.
      Returns:
      The left-right judgment, which is negative if x < y, positive if x $gt; y, and 0 if indeterminate.
    • faskLeftRightV1

      public static double faskLeftRightV1(double[] x, double[] y, boolean empirical, double delta)
      Calculates the left-right ratio using the Fask method version 1.
      Parameters:
      x - the array of values for variable x
      y - the array of values for variable y
      empirical - if true, applies empirical correction to the correlation coefficient
      delta - the threshold value for determining the sign of the left-right ratio
      Returns:
      the left-right ratio
    • robustSkew

      public static double robustSkew(double[] x, double[] y, boolean empirical)
      Calculates a left-right judgment using the robust skewness between two arrays of double values.
      Parameters:
      x - The data for the first variable.
      y - The data for the second variable.
      empirical - Whether to use an empirical correction to the skewness.
      Returns:
      The robust skewness between the two arrays.
    • skew

      public static double skew(double[] x, double[] y, boolean empirical)
      Calculates a left-right judgument using the skewness of two arrays of double values.
      Parameters:
      x - the first array of double values
      y - the second array of double values
      empirical - flag to indicate whether to apply empirical correction for skewness
      Returns:
      the skewness of the two arrays
    • g

      public static double g(double x)
      Calculates the logarithm of the hyperbolic cosine of the maximum of x and 0.
      Parameters:
      x - The input value.
      Returns:
      The result of the calculation.
    • correctSkewness

      public static double[] correctSkewness(double[] data, double sk)
      Corrects the skewness of the given data using the provided skewness value.
      Parameters:
      data - The array of data to be corrected.
      sk - The skewness value to be used for correction.
      Returns:
      The corrected data array.
    • search

      public Graph search()
      Runs the search on the concatenated data, returning a graph, possibly cyclic, possibly with two-cycles. Runs the fast adjacency search (FAS, Spirtes et al., 2000) followed by a modification of the robust skew rule (Pairwise Likelihood Ratios for Estimation of Non-Gaussian Structural Equation Models, Smith and Hyvarinen), together with some heuristics for orienting two-cycles.
      Specified by:
      search in interface IGraphSearch
      Returns:
      the graph. Some edges may be undirected (though it shouldn't be many in most cases) and some adjacencies may be two-cycles.
    • getB

      public double[][] getB()
      Returns the coefficient matrix for the search. If the search has not yet run, runs it, then estimates coefficients of each node given its parents using linear regression and forms the B matrix of coefficients from these estimates. B[i][j] != 0 means i->j with that coefficient.
      Returns:
      This matrix as a double[][] array.
    • getLrScores

      public double[][] getLrScores()
      Returns a matrix of left-right scores for the search. If lr = getLrScores(), then lr[i][j] is the left right scores leftRight(data[i], data[j]);
      Returns:
      This matrix as a double[][] array.
    • getDepth

      public int getDepth()

      Getter for the field depth.

      Returns:
      The depth of search for the Fast Adjacency Search (FAS).
    • setDepth

      public void setDepth(int depth)

      Setter for the field depth.

      Parameters:
      depth - The depth of search for the Fast Adjacency Search (S). The default is -1. Unlimited. Making this too high may result in statistical errors.
    • getElapsedTime

      public long getElapsedTime()

      getElapsedTime.

      Returns:
      The elapsed time in milliseconds.
    • getKnowledge

      public Knowledge getKnowledge()

      Getter for the field knowledge.

      Returns:
      the current knowledge.
    • setKnowledge

      public void setKnowledge(Knowledge knowledge)

      Setter for the field knowledge.

      Parameters:
      knowledge - Knowledge of forbidden and required edges.
    • setExternalGraph

      public void setExternalGraph(Graph externalGraph)
      Sets the external graph to use. This graph will be used as a set of adjacencies to be included in the graph is the "external graph" options is selected. It doesn't matter what the orientations of the graph are; the graph will be reoriented using the left-right rule selected.
      Parameters:
      externalGraph - This graph.
    • setSkewEdgeThreshold

      public void setSkewEdgeThreshold(double skewEdgeThreshold)
      Sets the skew-edge threshold.
      Parameters:
      skewEdgeThreshold - This threshold.
    • setTwoCycleScreeningCutoff

      public void setTwoCycleScreeningCutoff(double twoCycleScreeningCutoff)
      Sets the cutoff for two-cycle screening.
      Parameters:
      twoCycleScreeningCutoff - This cutoff.
    • setOrientationAlpha

      public void setOrientationAlpha(double orientationAlpha)
      Sets the orientation alpha.
      Parameters:
      orientationAlpha - This alpha.
    • setLeftRight

      public void setLeftRight(Fask.LeftRight leftRight)
      Sets the left-right rule used
      Parameters:
      leftRight - This rule.
      See Also:
    • setAdjacencyMethod

      public void setAdjacencyMethod(Fask.AdjacencyMethod adjacencyMethod)
      Sets the adjacency method used.
      Parameters:
      adjacencyMethod - This method.
      See Also:
    • setDelta

      public void setDelta(double delta)
      Sets the delta to use.
      Parameters:
      delta - This delta.
    • setEmpirical

      public void setEmpirical(boolean empirical)
      Sets whether the empirical option is selected.
      Parameters:
      empirical - True, if so.
    • leftRight

      public double leftRight(double[] x, double[] y)
      A left/right judgment for double[] arrays (data) as input.
      Parameters:
      x - The data for the first variable.
      y - The data for the second variable.
      Returns:
      The left-right judgment, which is negative if X<-Y, positive if X->Y, and 0 if indeterminate.
    • setSeed

      public void setSeed(long seed)
      Sets the seed for generating random numbers.
      Parameters:
      seed - the seed value to set
    • setVerbose

      public void setVerbose(boolean verbose)
      Sets the verbose mode.
      Parameters:
      verbose - the flag indicating whether to enable verbose mode or not